Supplemental Discussion accompanying Axelsson et al. “The genomic signature of dog
domestication reveals adaptation to a starch-rich diet”


Supplementary Discussion, section 1. Experimental verification confirms high 
accuracy of SNP calling. 

Stringent filtering criteria (see Methods) were used to identify 3,786,655 SNPs in the
combined dog and wolf data. Based on the high accuracy of SNP calls (>95%) in a previous
study that used a nearly identical SNP calling methodology, we believe that the majority
of these SNPs are true. To verify this, we designed an iPLEX assay targeting 124
SNPs. We obtained reliable genotyping calls for 114 of these SNPs, out of which 113
(99.1%) were confirmed to be variable (Table S3) in a panel of 71 dogs, representing 38
diverse breeds, and 19 wolves of worldwide distribution (Table S14). The iPLEX result
thus confirms a high accuracy of SNP calls. One position identified as a SNP by the
SOLiD data (chr26: 27,981,169, at which 10 reads support C and 3 support T) was
found to be invariable in the iPLEX assay (only C’s called). Although it is possible that
we failed to sample the variant allele in the iPLEX assay, this result may reflect SOLiD
sequencing errors.


Supplementary Discussion, section 2. Detecting signatures of selection in 200 Kb 
windows. 
Selection on a single advantageous mutation affects patterns of genetic variation at
nearby loci, causing (i) a reduction in heterozygosity, (ii) a skewed allele frequency
distribution and (iii) an excess of high frequency derived alleles across a region
surrounding the selected allele. We used two approaches to search for regions in the
dog genome that bear patterns of genetic variation consistent with these signatures of
selection. Using a 200 Kb sliding window we first calculated the average pooled
heterozygosity, Hp, in dogs and wolves separately (see Supplementary Discussion,
section 8, for a detailed description of the wolf analysis), and, secondly, the average
fixation index, FST, between the two taxa. Putatively selected regions were identified by
extracting windows in the low end of the Z(Hp) distribution and the high end of the
Z(FST) distribution, applying a threshold of 5 standard deviations from the mean. By
applying these thresholds we extracted 38 windows, representing 14 unique regions
with extremely low levels of heterozygosity in dog, and 82 windows, representing 35
unique regions, with strongly elevated FST values. We partitioned the FST regions into
those that are likely to represent selection in dog (n=30), wolf (n=3) or both taxa
simultaneously (n=2), based on the corresponding Z(Hp) scores in the two taxa
(Methods). A total of 36 unique autosomal candidate domestication regions (CDRs) were
identified by the two approaches combined (Table S6, Fig. 1bc).
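
The window scan described above can be illustrated with a short sketch. It is not the pipeline used in the study. It assumes that pooled heterozygosity is computed from summed major and minor allele read counts as Hp = 2·ΣnMAJ·ΣnMIN / (ΣnMAJ + ΣnMIN)^2 (the definition is given in Methods, which are not reproduced here), that all windows lie on one chromosome, and that overlapping outlier windows are merged into one region. All function names are hypothetical.

```python
from statistics import mean, pstdev


def pooled_heterozygosity(snps):
    """Hp for one window; snps is a list of (n_major, n_minor) read counts.

    Assumed definition: 2 * sum(n_major) * sum(n_minor) / (sum of all reads)^2.
    """
    sum_major = sum(n_maj for n_maj, _ in snps)
    sum_minor = sum(n_min for _, n_min in snps)
    total = sum_major + sum_minor
    return 2.0 * sum_major * sum_minor / (total * total)


def z_scores(values):
    """Z-transform a list of window statistics using the population sd."""
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd for v in values]


def merge_windows(windows):
    """Merge overlapping or touching (start, end) windows into unique regions."""
    regions = []
    for start, end in sorted(windows):
        if regions and start <= regions[-1][1]:
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


def candidate_regions(windows, hp_values, threshold=-5.0):
    """Return merged regions whose windows have Z(Hp) below the threshold."""
    z = z_scores(hp_values)
    outliers = [w for w, score in zip(windows, z) if score < threshold]
    return merge_windows(outliers)
```

The same functions apply to FST by Z-transforming the window FST values and keeping windows above +5 instead of below -5.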

As the variance of Hp and FST will depend on the number of SNPs used for each
calculation, spurious selection signals will be more likely in windows with few variable
sites. To reduce the number of false positives we have therefore required that windows
harbour a minimum of 10 variable sites to be included in the analysis. We tested a range
of different window sizes (50, 100, 200, 500 and 1000 Kb) and note that a window size
of 200 Kb results in a low number of windows with few SNPs; 49 out of 21,927
200 Kb windows contained <10 SNPs, in contrast to 3,758 out of 87,873 50 Kb
windows. Hence, using a window size of 200 Kb allowed us to screen a large fraction of
the genome at a false positive rate that is likely lower than if a smaller window size had
been used.

Even if windows harbour a sufficient number of variable sites, it may still be
challenging to differentiate between signals caused by selection and by genetic drift.
In this regard two parameters, window size and Z-score threshold, can be
tuned to reduce the risk of including false positives that are due to drift. Determining an
optimal window size, in terms of maximising the sensitivity towards detecting selection
events at a low cost of false positives, is however complicated given the complex and
partly unknown demographic history of dogs. By comparing the distribution of Hp in
analyses using window sizes of 200 and 50 Kb (Fig. 1 and Fig. S7, respectively), we note
that the larger window size is associated with a smaller standard deviation (sd200kb =
0.054 vs. sd50kb = 0.071), which may indicate that the outliers we have detected are
enriched for rare and strong selective sweeps compared to what smaller window sizes
would have detected. Determining a Z-score threshold that maximises sensitivity at a low
false positive rate is equally complicated, given the circumstances discussed above. We
chose to set the thresholds at Z(Hp)<-5 and Z(FST)>5 as they represent the extreme tails
of the distributions and hence should be enriched for true signals of selection. It is
however possible that windows with less extreme Z(Hp) and Z(FST) values merit further
investigation too, as they may also have contributed to dog domestication. We
suggest that the best way to evaluate the CDRs detected here is to seek confirmation of
the observed signatures of selection in additional dog and wolf individuals, as well as to
test if there is evidence for altered gene activity or protein function that is associated
with the genetic signal.
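
As a worked check on the minimum-SNP filter discussed in this section, the fraction of windows excluded at each window size can be computed directly from the counts quoted above. The function names are hypothetical; only the counts come from the text.

```python
def keep_window(n_snps, min_snps=10):
    """Apply the filter stated in the text: at least 10 variable sites per window."""
    return n_snps >= min_snps


def sparse_fraction(n_sparse, n_total):
    """Fraction of windows that fail the minimum-SNP filter."""
    return n_sparse / n_total


# Counts quoted in the text for 200 Kb and 50 Kb windows.
frac_200kb = sparse_fraction(49, 21_927)
frac_50kb = sparse_fraction(3_758, 87_873)
print(f"200 Kb: {frac_200kb:.2%}, 50 Kb: {frac_50kb:.2%}")
# prints "200 Kb: 0.22%, 50 Kb: 4.28%"
```

Roughly 0.2% of 200 Kb windows are excluded, against roughly 4.3% of 50 Kb windows.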


Supplementary Discussion, section 3. Confirming signatures of selection. 
High sequence coverage and mapping quality across CDRs. 
During mapping and variant calling we have applied a number of filtering criteria (see
Methods) to ensure that the data used for selection analyses is of high general quality.
Nevertheless, to test if CDRs may represent regions that are affected by experimental or
technical artefacts more than other regions, we compared CDRs and the genome as a
whole with regard to sequence coverage and mapping quality.

The average pooled heterozygosity (Hp) is negatively correlated with sequence
coverage (rho=-0.53, p<0.0001, Spearman; Fig. S8) and the average sequence
coverage is higher in regions of low heterozygosity compared to the genomic average
(24.5 (Z(Hp)<-5) vs. 19.1 (genomic average), p<0.0001, Wilcoxon). In line with these
observations, FST is positively correlated with sequence coverage (rho=0.17, p<0.0001,
Spearman; Fig. S9) and the average sequence coverage is higher in regions of high
fixation index compared to the genomic average (24.5 (Z(Hp)<-5) vs. 19.1 (genomic
average), p<0.0001, Wilcoxon). The relatively high sequence coverage in CDRs
reflects an inverse relationship between the theoretical minimum of allele frequencies
and sequence coverage. Our method may thus be biased towards detecting outliers in
regions of slightly higher sequence coverage; however, this ensures that the evidence for
putatively selected regions is strong.
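
The rank correlations reported above (Spearman's rho between a window statistic and coverage) can be computed as in the following sketch. It is a plain-Python illustration with hypothetical function names, not the software used in the study: values are converted to average ranks (ties share the mean rank), and the Pearson correlation of the ranks is returned.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Here `x` would hold the per-window Hp (or FST) values and `y` the per-window mean coverage.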

We also note that the root mean square mapping quality (RMS) of reads
mapping within CDRs (RMSCDR = 42) is slightly higher than the genomic average
(RMSgenome = 40.1, p<0.0001, Wilcoxon). In summary, these results indicate that
CDRs represent real outliers in terms of genetic variation.


Unequal sample size of dog and wolf.
To reduce the risk of introducing biases due to unequal sample size in dog and wolf we
have used a method to estimate FST that takes differences in sample size into account.
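
To illustrate what a sample-size-aware FST estimate looks like, the sketch below uses Hudson's estimator with the finite-sample correction described by Bhatia and colleagues. This estimator is swapped in purely for illustration and is not necessarily the one used in this study (see Methods for that). Per-SNP numerators and denominators are summed across a window before taking the ratio. The function names and the interpretation of `n1` and `n2` as the number of sampled alleles per population are assumptions.

```python
def hudson_fst_terms(p1, n1, p2, n2):
    """Numerator and denominator of Hudson's FST for one biallelic SNP.

    p1, p2: allele frequencies in the two populations.
    n1, n2: number of sampled alleles (assumed > 1) in each population.
    The 1/(n - 1) terms correct for finite, possibly unequal, sample sizes.
    """
    numerator = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    denominator = p1 * (1 - p2) + p2 * (1 - p1)
    return numerator, denominator


def window_fst(snps):
    """Ratio-of-sums FST for a window; snps is a list of (p1, n1, p2, n2)."""
    terms = [hudson_fst_terms(*snp) for snp in snps]
    total_num = sum(num for num, _ in terms)
    total_den = sum(den for _, den in terms)
    return total_num / total_den
```

A SNP fixed for different alleles in the two populations gives FST = 1 regardless of sample size, while identical frequencies give a slightly negative value whose magnitude grows as the samples shrink.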

Nevertheless, to test if the unequal dog and wolf representation may have
contributed to variance of statistics across the genome, we redid the FST and Hp analyses
based on a random subsample of the dog data such that the average coverage in dog
equalled that in wolf (6.2x) (Fig. S10 and S11).

First, analysing the subsampled data we detect 17 regions with Z(Hp)<-5, out of
which 12 overlap the 14 regions from the original analysis. The 2 regions from the
original analysis that did not overlap a region in the subsampling analysis still show a
clear reduction in heterozygosity (Z(Hp)<-4) in the subsampled data.

Secondly, we find 20 regions with Z(FST)>5 in the subsampled data, out of
which 18 overlap the 35 regions detected in the original analysis. The 17 regions from
the original analysis that do not overlap a region in the subsampling analysis still show
a clear increase in FST in the subsampling analysis (8 regions have Z(FST)>4 and the
remaining 9 regions have Z(FST)>3).

The majority of the original Hp regions, and about half of the original FST regions, were
thus identified again in the subsampled data, and the remainder still stand out, although
they do not pass the threshold. These results suggest that the unequal sample sizes of
dogs and wolves likely have little effect on the overall results of our selection analyses.
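
The subsampling step can be sketched as follows. This is an assumption-laden illustration, not the procedure used in the study: it thins the allele read counts at each SNP by keeping every read independently with probability `fraction`, where `fraction` would be the target (wolf) coverage divided by the original dog coverage. The function name is hypothetical.

```python
import random


def subsample_counts(n_major, n_minor, fraction, rng):
    """Thin read counts at one SNP, keeping each read with probability `fraction`.

    rng is a random.Random instance so that the subsample is reproducible.
    """
    kept_major = sum(1 for _ in range(n_major) if rng.random() < fraction)
    kept_minor = sum(1 for _ in range(n_minor) if rng.random() < fraction)
    return kept_major, kept_minor


# Usage: thin a SNP with 30 major and 10 minor reads to roughly one third.
rng = random.Random(1)
thinned = subsample_counts(30, 10, 1 / 3, rng)
```

Window statistics are then recomputed from the thinned counts and compared with the original scan.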


Confirming signatures of selection. 
We used several independent means and datasets to confirm that our methodology 
identifies regions that truly represent outliers in terms of genetic variation. 

First, as strong selection for an advantageous allele will lead to a sharp rise in
frequency of linked genetic variation, it is expected that a significant fraction of variable
sites surrounding the selected allele will become lost due to fixation. Genetic signatures
of selection are thus expected to show a reduction in the number of segregating sites. In
line with this expectation we note that Hp is positively correlated with the average
number of segregating sites in 200 Kb windows (rho=0.23, p<0.0001, Spearman; Fig.
S12) and that the average number of segregating sites is markedly lower in regions of
low heterozygosity compared to the genomic average (71.2 (Z(Hp)<-5) vs. 253.3
(genomic average), p<0.0001, Wilcoxon). Similarly, we also observe that the average
FST is negatively correlated with the corresponding average number of segregating sites
in 200 Kb windows (rho=-0.25, p<0.0001, Spearman; Fig. S13) and that the average
number of segregating sites is markedly lower in regions of high fixation index
compared to the genomic average (97.8 (Z(FST)>5) vs. 244.9 (genomic average),
p<0.0001, Wilcoxon).

We then genotyped 47 and 48 randomly selected SNPs across the MGAM and
SGLT1 regions, respectively, in 71 dogs representing 38 diverse breeds and 19 wolves of
worldwide distribution (the reference panel, Table S14). We noted a significant
reduction in heterozygosity for dogs, as well as a significant increase in FST across these
regions, consistent with the signals observed in the resequencing data (Main text, Fig. 3
and S4). We also genotyped 17 additional diagnostic SNPs representing 13 more CDRs
in the same panel of dogs and wolves as described above. The FST for these SNPs
averaged 0.63, which is clearly above the genomic average of 0.22. This difference
provides additional support that CDRs represent real outliers in terms of genetic
differentiation between dog and wolf.

We finally compared patterns of genetic variation within and outside CDRs in a
panel of 507 dogs and 15 wolves that we recently genotyped using a high density SNP
array (Illumina 170K HD canine array, containing 173,622 SNPs with an average
spacing of 13 Kb). We note a clear drop in SNP density (0.04/Kb in CDRs vs. 0.09/Kb
genome wide) and dog minor allele frequency (0.08 in CDRs vs. 0.24 genome wide),
but only a slight decrease in wolf minor allele frequency (0.19 in CDRs vs. 0.29
genome wide) in CDRs compared to the genome wide average (Table S22). We also see
a clear increase in FST in CDRs (0.27) compared to the genome wide average (0.13)
(Table S22). These patterns are consistent with selection acting specifically in the dog
lineage. We also studied the detailed patterns of genetic variation near the MGAM and
SGLT1 regions using the HD-array data. Within the MGAM region we note four
consecutive SNPs spanning the MGAM and TAS2R38 genes (chr. 16: 10,153,271;
10,171,680; 10,182,601 and 10,212,709) that are completely fixed in all 507 dogs (Fig.
S14), again corroborating the results of the resequencing data (Fig. 3). Within the
SGLT1 region we note a single SNP (chr. 26: 27,964,669) that is completely fixed in
dogs (Fig. S15). This SNP is located just downstream (558 bp) of the candidate
causative amino acid substitution (chr. 26: 27,964,111) (Main text).

The consistency of the signals observed in the resequencing data and the two
genotyping assays provides strong support for the ability of our method to identify
potential signatures of selection affecting all, or nearly all, dogs.


Comparison of CDRs and signatures of selection detected in a previous study. 

Using a combination of FST and haplotype based analyses, VonHoldt and colleagues
identified a set of 14 regions that they argued may have been affected by selection
during the initial phase of dog domestication. A comparison of the results of our study
and those of VonHoldt et al. could thus represent an additional means to validate our
results. We do however note that none of the regions detected by VonHoldt and
colleagues overlap any of the CDRs detected here. We investigated this discrepancy in
more detail by extracting the maximum Z(FST) score, as well as the minimum Z(Hp)
score, recorded in our data among all 200 Kb windows that overlap each of the 14
VonHoldt regions. We find that the average maximum Z(FST) of the 14
VonHoldt regions (avg. Z(FST)=1.83, range: -0.32 to 3.66) is only moderately elevated.
Similarly, we note that the average minimum Z(Hp) of the 14 VonHoldt regions (avg.
Z(Hp)=-1.55, range: 0.53 to -2.73) represents a moderate drop in heterozygosity. How
can this discrepancy between the results of the studies be explained?
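
The per-region summary used in this comparison (the maximum Z(FST) and minimum Z(Hp) among all windows that overlap a published region) can be sketched as below. The tuple layout and function names are hypothetical, and windows and regions are assumed to lie on the same chromosome and to use half-open coordinates.

```python
def overlaps(window, region):
    """True if two (start, end) intervals share at least one position."""
    return window[0] < region[1] and region[0] < window[1]


def summarise_region(region, windows):
    """Return (max Z(FST), min Z(Hp)) over windows overlapping the region.

    windows: list of (start, end, z_fst, z_hp) tuples.
    Returns None if no window overlaps the region.
    """
    hits = [w for w in windows if overlaps((w[0], w[1]), region)]
    if not hits:
        return None
    return max(w[2] for w in hits), min(w[3] for w in hits)
```

Averaging these per-region values over all regions gives summary figures of the kind quoted above.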

First, VonHoldt et al. based their calculations on genotyping data, in which 
SNPs were ascertained almost exclusively in dog (this is unlike our study where wolves 
have contributed significantly to the SNP discovery). Since selection affecting the 
ancestors of all modern dogs results in a reduction in the number of segregating sites 
across the affected region, it follows that regions affected by strong selection are 
unlikely to be represented on the genotyping array. Based on this bias, most of the 
regions affected by strong selection during early dog domestication would have been 
missed in the analysis of VonHoldt et al. Weak selection or partial sweeps (such as 
selection in a subset of dog breeds) may still have been detected using the genotype data 
(Fig. S16). 

Secondly, due to the low marker density many VonHoldt regions were
represented by a relatively limited number of variable sites, which may have led to a
large variance associated with the test statistic. Similarly, in a particular VonHoldt
region, spanning ~250 Kb (Fig. S17), the wide spacing of markers may have led the
analysis of VonHoldt et al. to erroneously assume that markers from two separate
regions with skewed allele frequencies represented one single signal. This may have led
to a falsely inflated test statistic for this region.

Finally, since we have used a window size of 200 Kb it is possible that we could
have missed smaller regions. However, among the 14 VonHoldt regions, only four span
just under 200 Kb and in none of these cases do we see any clear indications of
selection spanning a narrow region in the resequencing data.

To summarise, it thus appears that the regions detected by VonHoldt et al.
represent weak selection events. Based on these conclusions we see no reason to
question the validity of the CDRs detected here due to the divergent results of the two
studies.


Large overlap between Hp and FST regions.

Due to the way the fixation index and pooled heterozygosity are estimated, they are
expected to be highly correlated. Yet, a number of CDRs are only detected as
significant outliers using one of the statistics. To understand why this is the case we
studied these regions in more detail.

23 FST regions do not overlap regions of significantly reduced heterozygosity in
either dog (Z(Hp)DOG<-5) or wolf (Z(Hp)WOLF<-5). All of these regions do however
still show clear reductions in heterozygosity, with an average Z(Hp) = -4.41 (range:
-3.87 to -4.94) (calculated based on the lowest of the values in dog and wolf) (see Fig.
S18). These regions thus all appear to be regions that nearly reach the threshold for
significance in the Hp analysis.

Four regions of reduced Hp in dog (Z(Hp)DOG<-5) do not overlap an FST region
(Z(FST)>5). Again, all of these regions do however still show clear increases in FST,
with an average Z(FST) = 4.56 (range: 4.48 to 4.64). This shows that the non-overlapping
regions are borderline cases that almost reach the threshold of significance in the FST
analysis. We note that in two of these regions, there are only 12 and 17 segregating sites
(see Fig. S19), respectively (the average number of segregating sites in regions of
Z(FST)>5 is 97.8). This suggests that estimates for these particular borderline cases
likely have higher variances, which in turn could help explain why they reach
significance in the Hp, but not the FST, analysis.


Supplementary Discussion, section 4. Candidate domestication regions on
chromosome X.

The X-chromosome differs from autosomes in several population genetic aspects,
including a reduction in effective population size and recombination rate. Via the effects
of genetic drift this is expected to lead to a significant reduction in levels of genetic
variation on the X-chromosome compared to the autosomes. In addition, a reduced
mutation rate will further deplete genetic variation on chromosome X.
Furthermore, as a result of the influence of drift it is also expected that the
X-chromosome will show increased genetic differentiation between dog and wolf relative
to the autosomes. To avoid confounding the results of the main analyses due to these
circumstances we decided to analyse chromosome X separately. We used the same
methods for these analyses as described previously for the autosomes. As expected, we
find that the average pooled heterozygosity, Hp, is lower (HpX: 0.30 vs. HpA: 0.33) and
the average fixation index, FST, is higher (FST X: 0.31 vs. FST A: 0.20) on X compared to
the autosomes (Fig. S20). We also note that the standard deviations of the Hp (σX: 0.096
vs. σA: 0.054) and FST (σX: 0.18 vs. σA: 0.091) distributions are larger on the
X-chromosome relative to the autosomes, and that a relatively large proportion of the
windows reside in the tails of the distributions (Fig. S20). Although this could indicate
massive selection on the X-chromosome during dog domestication, it is likely that this
mainly reflects a more prominent role of genetic drift on X compared to the autosomes.
No windows passed the thresholds of significance (Z(Hp)<-5 or Z(FST)>5) used for the
autosomal analyses, again emphasising the difficulty of separating selection from drift
on the X-chromosome. With this difficulty in mind, we nevertheless decided to extract
the most extreme windows for further analyses as they represent clear candidates for
selection on this chromosome. We did this by applying a threshold of 3 standard
deviations from the mean of both the Hp and FST distributions. 13 windows,
representing 6 unique regions, passed this cut-off in the FST analysis (Fig. S21, Table
S23). No windows passed this threshold for the Hp analysis. A region spanning 200 Kb
at chr. X: 42,905,187-43,105,187 represents the most extreme CDR in terms of absolute
Hp (0.015) and FST (0.96) values recorded throughout the entire dog genome (Table
S23). This region harbours CCNB3, which codes for the G2/mitotic-specific cyclin-B3,
which is essential for oocyte maturation. It is thus possible that selection for variants
of this gene could be associated with the reproductive changes accompanying dog
domestication, for example changes from one to two estrous cycles per year.


Supplementary Discussion, section 5. Amylase CNV break point suggests 
duplications arose via unequal crossing over using L1 LINE elements as template. 
A CDR on chromosome 6 coincides with a sharp increase in coverage at the locus
coding for amylase (Fig. 2). Two independent CNV detection methods (refs 44,45) and
real time quantitative PCR (Fig. 2c, Table S11) confirm a drastic increase in amylase copy
numbers in dog relative to wolf. This copy number difference is further supported by a
4-8-fold increase in aligned read depth in dog relative to wolf across 5 additional
alpha-amylase copies residing on three unmapped contigs (Fig. S22). Two of these copies
reside on individual unplaced contigs while three are annotated in tandem on the same
unplaced contig (chromosome Un) (Fig. S22). The close proximity of the three
neighbouring copies suggests that the alpha-amylase CNV arose via tandem duplications
as a result of unequal crossing over between neighbouring homologous DNA sequences.
To identify the likely target for the initial unequal crossing over event we sought to map
the break point between individual copies by studying sequence coverage across the
three chromosome Un contigs. A clear increase in dog relative to wolf coverage
extended to the contig ends in all cases except the upstream end of the contig
harbouring three amylase copies. We estimated the average sequence coverage in 1 Kb
windows and identified the first window in this contig for which the ratio of dog to wolf
coverage exceeded 3. This window is located between positions 46,696,000 and
46,697,000 on chromosome Un. We compared sequence coverage at individual sites
across a 10 Kb region spanning this window and noted a clear increase in coverage just
downstream of position 46,696,000 (data not shown). This position harbours an L1 LINE
element that reoccurs between all three amylase copies on this contig as well as
upstream of the two additional annotated amylase copies. It is thus likely that L1 LINE
elements located on opposite sides of the ancestral amylase copy served as templates for
an unequal crossing over event that caused the initial duplication event from which
several additional duplications followed. In support of this, we recently identified L1
LINE elements as the most highly enriched sequence feature near CNVs in the dog
genome (ref. 46).
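
The break point search described above (average coverage in 1 Kb windows, then the first window where the dog-to-wolf ratio exceeds 3) can be sketched as follows. The function name and the per-base coverage input format are hypothetical, and the sketch applies no normalisation for the overall difference in sequencing depth between dog and wolf, which a real analysis may need.

```python
def first_elevated_window(dog_cov, wolf_cov, window=1000, ratio_threshold=3.0):
    """Return (start, end) of the first window whose dog/wolf coverage ratio
    exceeds the threshold, or None if there is no such window.

    dog_cov, wolf_cov: per-base coverage lists of equal length for one contig.
    Windows are non-overlapping; coordinates are 0-based offsets into the lists.
    """
    for start in range(0, len(dog_cov) - window + 1, window):
        dog_mean = sum(dog_cov[start:start + window]) / window
        wolf_mean = sum(wolf_cov[start:start + window]) / window
        # Skip windows without wolf coverage to avoid division by zero.
        if wolf_mean > 0 and dog_mean / wolf_mean > ratio_threshold:
            return start, start + window
    return None
```

Per-site coverage within and around the returned window can then be inspected to narrow down the position at which coverage rises.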


Supplementary Discussion, section 6. Chromosome 16 sweep likely represents two 
independent selection events. 

A single 200 Kb window (chr. 16: 10,107,390-10,307,390) spanning the MGAM locus
has a significant reduction in average pooled heterozygosity (Hp). This putatively
selected region is however extended by the FST analysis to include 300 Kb of upstream
sequence, thus in total spanning a 500 Kb region (chr. 16: 9,807,391-10,307,390). By
studying heterozygosity and FST estimates for individual sites across this large region
(Fig. 3b) we note a short, highly variable section (spanning approximately chr. 16:
10,055,000-10,095,000) that separates this region into two smaller regions. We believe
that this reflects a true increase in genetic variation rather than alignment artefacts for
several reasons. First, the region spanning this highly variable 40 Kb section is syntenic
in the human genome, arguing against a misassembly in the dog reference genome.
Secondly, we see no average difference in sequence coverage between wolf and dog,
nor any other evidence indicative of a CNV in this region. Finally, the frequency of
alternatively oriented mate-pair reads in this region does not differ from surrounding
sequence. This argues that, rather than representing a single sweep, the 500 Kb region
represents two independent selection events, one of which affects the MGAM locus
(which is discussed in detail in the main text), and a second affecting the nearby T-cell
receptor cluster. T-cell receptors play crucial roles in the immune system by recognizing
antigens bound to class I or class II major histocompatibility proteins and, given that
immune system genes are frequent targets of selection, it is not unexpected that
selection may have affected the T-cell receptor cluster in dog. We searched for
candidate mutations that may have been targeted by this sweep and observed a
non-conservative amino-acid substitution leading to a shift from glycine in wolf to
glutamic acid in dog at residue 61 (chr. 16: 9,852,935) of the T cell receptor beta
variable 2 (TRBV2). The resequencing data indicate that this represents a fixed
difference between dog and wolf as all canine reads (n=36) support a C, while all wolf
reads (n=6) support a T at this position. We confirmed a high degree of differentiation
by genotyping this mutation in 72 dogs representing 38 diverse breeds and 21 wolves of
worldwide origin (Table S14). Among dogs genotyped, all except two West Highland White
Terriers carried at least one copy of the C allele: 62 dogs were homozygous and 8
heterozygous for this allele. 17 out of 21 wolves were homozygous and 2 were
heterozygous for the T allele while two wolves were homozygous for the C allele. In
addition, we note that three consecutive SNPs (chr. 16: 9,874,039; 9,890,639 and
9,967,827) on the Illumina 170K Canine HD array located just downstream of the
candidate mutation are fixed in 502 dogs (Fig. S14).
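
The genotype counts reported above translate into allele frequencies as in the following worked example. It assumes that the two West Highland White Terriers without a C allele were homozygous for T; the function name is hypothetical.

```python
def allele_frequency(n_hom, n_het, n_other_hom):
    """Frequency of an allele from diploid genotype counts.

    n_hom: individuals homozygous for the allele of interest.
    n_het: heterozygous individuals.
    n_other_hom: individuals homozygous for the alternative allele.
    """
    n_alleles = 2 * (n_hom + n_het + n_other_hom)
    return (2 * n_hom + n_het) / n_alleles


# Frequency of the C allele, using the genotype counts given in the text.
dog_c = allele_frequency(62, 8, 2)    # 132 of 144 alleles, about 0.92
wolf_c = allele_frequency(2, 2, 17)   # 6 of 42 alleles, about 0.14
```

Under that assumption the C allele is at a frequency of about 0.92 in the genotyped dogs and about 0.14 in the genotyped wolves.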

The high degree of dog-wolf differentiation in combination with the non-conservative
nature of this amino-acid substitution argues that this represents the target
of the selective sweep. The fact that the mutation affects a receptor responsible for
antigen binding suggests that this change may reflect exposure to new pathogens as
ancestral dogs shifted to a life in close proximity to humans.


Supplementary Discussion, section 7. CDRs are not enriched for CNVs. 

To test if CNVs might have played a major role in forming the molecular basis for
adaptations during dog domestication we compared the relative abundance of CNVs
that are present in all dog breeds, but lacking in wolf, in CDRs (n=8) and the genome as
a whole (n=881). Based on the genome wide estimates of CNV abundance, the expected
number of CNVs per 200 Kb window is 0.038 and the observed number in 200 Kb
windows in CDRs is 0.032. We conclude that there is no evidence for an enrichment of
these CNVs in CDRs (p=0.66, Chi square test).


Supplementary Discussion, section 8. Detecting signatures of selection in wolf. 
Although it is possible that some wolf populations have experienced severe bottlenecks
recently, the significantly larger effective population size in wolf relative to the
ancestral pre-breed dog population suggests that wolves suffered to a lesser extent than
dogs from historical bottlenecks. Selection and drift should thus be more easily teased
apart in wolf than in dog, arguing that our data, despite including only a single
wolf pool, represent a first opportunity to detect signatures of recent selection
throughout the wolf genome. In agreement with this prediction the distribution of average
pooled heterozygosity (Hp) in 200 Kb windows across the autosomal part of the wolf
genome forms a narrow peak centred around the mean (avg. autosomal Hp: 0.280), with
a few clear outliers harbouring very low levels of genetic variation. We applied the
same threshold for identifying outliers (Z(Hp)<-5) as in the dog analysis and extracted
23 windows, representing 18 unique regions with extremely low levels of
heterozygosity in the wolf genome (avg. length: 272 Kb; avg. Hp: 0.049, range
0.013-0.067) (Table S24, Fig. 1, S2 and S23). We combined these results with those of the
previously described FST analyses (see Supplementary Discussion, section 2) to detect 21
unique wolf candidate selection regions (wolf CSRs) containing 113 annotated genes. To
formally characterise the function of genes in wolf CSRs we searched for
overrepresented GO-terms associated with these genes and found several terms related to
intracellular signalling and protein kinase cascades (Table S25). For example, in a wolf
CSR on chromosome 13 we note STK3, which encodes the stress activated
serine/threonine-protein kinase 3 that mediates apoptotic signals. It is unclear what
may have triggered the potential selection for altered signal transduction, although
stress related to pathogens or other environmental stimuli are potential candidates.

Two wolf CSRs encompass single genes, which may hence represent the target
of the putative selection; ADCY10 encodes adenylyl cyclase 10 that may play a role in
sperm maturation, and Neurobeachin codes for a protein that is involved in body
weight control, presumably via an effect on feeding behaviours.

Apart from casting light on the recent evolutionary history of the wolf itself,
studying selection in wolves serves to contrast the results of the dog analyses. By
comparing the results of the GO-analyses of regions under the potential influence of
selection in dog and wolf, respectively, we find no apparent functional overlap,
indicating that distinct selective pressures have been operating in the two lineages. The
result of the wolf analysis thus confirms that selection for efficient starch digestion and
a potentially altered nervous system development is unique to the dog lineage. Based on
the divergent results of the dog and wolf analyses, it is furthermore unlikely that
methodological problems related to the GO-analysis have biased the results of the dog
analysis significantly.


Supplementary Discussion, section 9. The genetic relationship of dog and wolf
pools.

We used the allele frequencies at the 3,786,655 SNPs in individual pools to construct a 
phylogenetic tree (Fig. S24) that summarises the genetic relationship of wolves and the 
five dog pools. This analysis shows that all dog pools are more closely related to each 
other than to the wolf pool and that no dog pool can be considered more ancestral (or 
ancient) than other pools. This agrees with earlier results suggesting that the wolf is the 
ancestor of all dogs and that the formation of most modern breeds happened in a short 
time frame starting with the same relatively limited breeding stock. 

Supplemental Figures and Legends 

Figure S1. Distribution of sequence coverage for the five dog pools combined and a 
single wolf pool. 

Figure S2. Distribution of heterozygosity and fixation index. Distribution of average 
pooled heterozygosity in dog (Hp-DOG) and wolf (Hp-WOLF) respectively, as well as 
average fixation index (FST), for autosomal 200 Kb windows (σ, standard deviation; μ, 
average). 

Figure S3. AMY2B expression measured in pancreas is significantly correlated 
with copy number in 12 wolves and 8 dogs (rho=0.84, p<0.0001, Spearman). 

Figure S4. Serum amylase activity is significantly correlated with AMY2B copy 
number in 7 wolves and 11 dogs (rho=0.63, p=0.0053, Spearman). 

Figure S5. Selection affected main glucose transporter SGLT1. a) Pooled 
heterozygosity, Hp (blue dots), and average fixation index, FST (orange dots), plotted for 
200 Kb windows across a region spanning 20-35 Mb on chromosome 26. Dashed 
vertical lines indicate the location of the selected region harbouring SGLT1. b) 
Magnification of the region affected by selection showing heterozygosity, H (blue dots), 
and fixation index, FST (orange dots), estimated for single SNPs. Dashed horizontal lines 
delineate the genotyped region shown in panel c. c) Haplotypes inferred from genotyping of 
48 SNPs in 71 dogs and 19 wolves show the location of a 50.5 Kb region, spanning 
approximately 27.96-28.01 Mb, that is nearly fixed in all dogs. Red colour represents 
the major dog allele, while blue represents the minor dog allele. Genes residing in the 
genotyped region are shown below panel c. 

Figure S6. SGLT1 is not significantly differentially expressed in dog and wolf 
pancreas (NWOLF=4, NDOG=9, P=0.39, Wilcoxon). 

Figure S7. Distribution of pooled heterozygosity, Hp, and average fixation index, 
FST, and corresponding Z-transformations, Z(Hp) and Z(FST), estimated in 50 Kb 
windows across all dog autosomes. The standard deviation (σ) of the autosomal 
average (μ) is indicated for each distribution. 

Figure S8. The average Hp of all autosomal 200 Kb windows plotted against the 
corresponding average read coverage. Hp is negatively correlated with sequence 
coverage (rho=-0.53, p<0.0001, Spearman). 

Figure S9. The average FST of all autosomal 200 Kb windows plotted against the 
corresponding average read coverage. FST is positively correlated with sequence 
coverage (rho=0.17, p<0.0001, Spearman). 

Figure S10. Distribution of average pooled heterozygosity, Hp, and average fixation 
index, FST, and corresponding Z-transformations, Z(Hp) and Z(FST), estimated in 
200 Kb windows across all dog autosomes. The analyses are based on the original 
wolf data and a subsampled dog data set that matches the wolf data in terms of 
sequence coverage. 

Figure S11. The positive end of the Z(FST) distribution plotted along dog autosomes 
1-38 (chromosomes are separated by colour). A dashed horizontal line indicates the 
cut-off (Z>5) used for extracting outliers. The negative end of the Z(Hp) distribution plotted 
along dog autosomes 1-38. A dashed horizontal line indicates the cut-off (Z<-5) used 
for extracting outliers. The analyses are based on the original wolf data and a 
subsampled dog data set that matches the wolf data in terms of sequence coverage. 

Figure S12. The average Hp of all autosomal 200 Kb windows plotted against the 
corresponding number of segregating sites per window. Hp is positively correlated 
with the number of segregating sites (rho=0.23, p<0.0001, Spearman). 

Figure S13. The average FST of all autosomal 200 Kb windows plotted against the 
corresponding number of segregating sites per window. FST is negatively correlated 
with the number of segregating sites (rho=-0.25, p<0.0001, Spearman). 

Figure S14. Genetic variation in a region spanning 9.6-10.8 Mb on chr. 16 that 
encompasses the MGAM CDR and the nearby T-cell receptor cluster CDR (located 
at approx. 9.8-10 Mb). 507 dogs and 15 wolves were genotyped using the Illumina 
170K Canine HD array. A dashed horizontal line separates dogs from wolves. Short tick 
marks represent individual SNPs. Long tick marks indicate position in Mb. Red 
colour: homozygous A-allele; green colour: heterozygous; blue colour: homozygous 
a-allele; white colour: missing genotype call. Four SNPs spanning the MGAM and 
TAS2R38 genes are completely fixed in all dogs. 

Figure S15. Genetic variation in a region spanning 27.8-28.2 Mb on chr. 26 that 
encompasses the SGLT1 CDR. 507 dogs and 15 wolves were genotyped using the 
Illumina 170K Canine HD array. A dashed horizontal line separates dogs from 
wolves. Short tick marks represent individual SNPs. Long tick marks indicate position 
in Mb. Red colour: homozygous A-allele; green colour: heterozygous; blue colour: 
homozygous a-allele; white colour: missing genotype call. A single SNP in the SGLT1 
gene is completely fixed in all dogs. 

Figure S16. Example of a VonHoldt region with moderate evidence of selection in 
the pooled resequencing data (maximum Z(FST) = 3.664, minimum Z(Hp) = -4.04). 

Figure S17. Example showing how markers from two narrow regions, with skewed 
allele frequencies, may have been mistaken to represent a single region spanning 
~280 Kb in the analysis by VonHoldt et al. (maximum Z(FST) = 2.67, minimum 
Z(Hp) = -1.47). 

Figure S18. Example of a significant FST region (Z(FST)>5) that did not pass the 
Z(Hp) threshold (Z(Hp)<-4.60). 

Figure S19. Example of a significant Hp region (Z(Hp)<-5) that did not pass the 
Z(FST) threshold (Z(FST)=4.53). In this particular region the most extreme 200 Kb 
window in terms of Hp only harboured 12 segregating sites. 

Figure S20. Distribution of pooled heterozygosity, Hp, and average fixation index, 
FST, and corresponding Z-transformations, Z(Hp) and Z(FST), estimated in 200 Kb 
windows across chromosome X. The standard deviation (σ) of the autosomal average 
(μ) is indicated for each distribution. 

Figure S21. Z-transformed average fixation index (only positive values shown), Z(FST), 
and pooled heterozygosity (only negative values shown), Z(Hp), in 200 Kb windows 
across chromosome X. Dashed horizontal lines show Z(FST)>3 and Z(Hp)<-3, 
respectively. 

Figure S22. Sequence coverage averaged across 1 Kb windows in dog (blue line) 
and wolf (red line) at five annotated amylase gene copies residing on chromosome 
unknown. The relative dog to wolf coverage (green line) shows a 4- to 8-fold increase 
(half the ratio is plotted) across the amylase copies, indicative of a significant increase 
in copy number in dog. 

Figure S23. The positive end of the Z(FST) distribution plotted along autosomes 1-38 
(chromosomes are separated by colour). A dashed horizontal line indicates the cut-off 
(Z>5) used for extracting outliers. The negative end of the Z(Hp) distribution in wolf 
plotted along autosomes 1-38. A dashed horizontal line indicates the cut-off (Z<-5) used 
for extracting outliers. 

Figure S24. Phylogenetic tree depicting the relationship of the 5 dog pools 
(pool2-6) and the single wolf pool (pool1). Pools 2, 3 and 6 are a mixture of breeds, and these 
have shorter branches showing more shared variation than the single breed pools. 

Supplemental Tables 


Table S1. Pool information for resequencing data from dogs and wolves. A single 
wolf pool represents wolf diversity across Eurasia and North America, three dog pools 
represent four separate breeds respectively and the remaining two dog pools contain 
DNA from representatives of a single breed. The number of individuals from each breed 
(n) and the average sequence and assembly coverage per pool is indicated. 

Total 29.8x 94 

Table S2. SNP summary statistics. Table shows the number of SNPs called in all dog 
and wolf data combined (SNP count); the number of SNPs covered by at least one 
sequencing read in wolf (Wolf coverage) and dog (Dog coverage); number of SNPs 
segregating only in dog (Private to dog) and wolf (Private to wolf); number of SNPs 
fixed in dogs and at the same time either fixed or segregating for the alternate allele in 
wolves (Fixed). “Fixed” refers to SNPs with allele frequency > 0.95. 

Chr.   SNP count  Wolf coverage  Dog coverage  Private to dog  Private to wolf  Fixed between wolf and dog 
1      185673     180288         185070        86448           8341             1575 
2      120703     116954         120287        56306           4653             807 
3      157222     152698         156727        74600           5995             1136 
4      149078     144899         148609        70776           5572             1021 
5      146796     142581         146365        69744           5078             921 
6      123932     120182         123535        59461           4584             961 
7      128151     124714         127747        59214           4555             772 
8      119933     116405         119461        55274           4252             754 
9      80984      78157          80760         39005           3064             642 
10     102917     99712          102583        49653           4483             933 
11     113233     110204         112851        51840           4797             811 
12     126438     122677         126056        59529           4799             879 
13     112292     109241         111944        51293           3839             706 
14     102850     100098         102496        47457           3592             645 
15     99480      96653          99163         47229           4080             760 
16     103571     100413         103208        47653           3798             804 
17     110635     107581         110268        51414           3764             669 
18     88670      85779          88404         42447           4682             1128 
19     102612     99717          102284        46984           3422             566 
20     82422      80021          82170         37474           2831             507 
21     98244      95538          97900         45529           2883             472 
22     102249     99438          101876        47366           3634             683 
23     93155      90729          92854         41414           2903             469 
24     77485      75048          77230         36819           2725             503 
25     87717      85157          87460         42343           3428             745 
26     70446      68159          70184         32953           2111             407 
27     81738      79398          81454         37809           2195             359 
28     69287      67126          69089         32497           2515             473 
29     81532      79372          81267         37236           2306             363 
30     64607      62583          64415         29444           1713             285 
31     76801      74411          76514         35789           2724             539 
32     77181      74947          76904         34530           2298             349 
33     59705      58021          59517         27568           2128             366 
34     83556      81072          83308         38564           2793             472 
35     59015      57419          58807         26476           1416             223 
36     55354      53823          55170         25170           1958             359 
37     54015      52460          53838         24531           2106             393 
38     53605      51836          53406         24578           1488             265 
X      83371      80095          83182         46492           7313             3025 

total  3786655    3675606        3774363       1770909         140818           27747 
                  (97.1%)        (99.7%)       (46.8%)         (3.7%)           (0.7%) 

Table S3. 99.1% of SNPs validated using iPLEX technology. 

Assays                                     Number of SNPs 
Total number of SNP assays                 124 
Failed SNP assays                          0 
Excluded SNPs due to low call rates        10 
Erroneous base call in genome assembly     0 
Total number of successful SNP assays      114 
Erroneous base call in SOLiD data          1 
Verified SNPs                              113 

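The percentages quoted in Tables S2 and S3 can be re-derived from the counts in the tables themselves. The following is a minimal sketch of that arithmetic; the variable and function names are illustrative only and are not part of the original analysis.

```python
# Counts copied from the "total" row of Table S2.
TOTAL_SNPS = 3_786_655

table_s2_totals = {
    "Wolf coverage": 3_675_606,
    "Dog coverage": 3_774_363,
    "Private to dog": 1_770_909,
    "Private to wolf": 140_818,
    "Fixed": 27_747,
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Share of all called SNPs that falls in each Table S2 category.
s2_percentages = {name: percent(n, TOTAL_SNPS) for name, n in table_s2_totals.items()}

# Table S3: 124 assays, 10 excluded for low call rates, 1 erroneous SOLiD call.
successful = 124 - 10
verified = successful - 1
validation_rate = percent(verified, successful)

if __name__ == "__main__":
    for name, pct in s2_percentages.items():
        print(f"{name}: {pct}%")
    print(f"iPLEX validation rate: {verified}/{successful} = {validation_rate}%")
```

Run as a script, this prints one percentage per Table S2 category followed by the iPLEX validation rate, which can be compared directly with the bracketed values in Table S2 and the title of Table S3.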

Table S4. Ranking of 200 Kb windows in the dog genome based on a significantly 
reduced average pooled heterozygosity, Hp, sorted by Z-score. CDR indicates which 
candidate domestication region the window is part of. Z-score refers to the value of the 
window after Z-transformation of the Hp distribution. Ensembl ID and gene name or 
gene description is shown for genes residing in these windows. 

CDR  Position (Chrom:Mb)  Hp     Z-score  Ensembl ID          Gene 
12   18: 3.9-4.1          0.015  -5.82 
14   25: 4.1-4.3          0.016  -5.79 
2    1: 6.0-6.2           0.018  -5.76    ENSCAFG00000000017  ZNF236 
13   18: 6.3-6.5          0.018  -5.75 
10   16: 10.1-10.3        0.022  -5.68    ENSCAFG00000003841  MGAM 
10   16: 10.1-10.3        0.022  -5.68    ENSCAFG00000003856  Q2ABD2_CANFA 
10   16: 10.1-10.3        0.022  -5.68    ENSCAFG00000003864  CLEC5A 
10   16: 10.1-10.3        0.022  -5.68    ENSCAFG00000003872 
10   16: 10.1-10.3        0.022  -5.68    ENSCAFG00000003876  XM_539875.1 
10   16: 10.1-10.3        0.022  -5.68    ENSCAFG00000003879 
13   18: 6.5-6.7          0.024  -5.65 
7    11: 40.9-41.1        0.024  -5.65 
13   18: 6.9-7.1          0.025  -5.62 
13   18: 7.0-7.2          0.027  -5.59 
4    6: 27.1-27.3         0.03   -5.54    ENSCAFG00000017807  CRYM 
4    6: 27.1-27.3         0.03   -5.54    ENSCAFG00000017810  ANKS4B 
4    6: 27.1-27.3         0.03   -5.54    ENSCAFG00000017814  ZP2_CANFA 
4    6: 27.1-27.3         0.03   -5.54    ENSCAFG00000017819  TMEM159 
4    6: 27.1-27.3         0.03   -5.54    ENSCAFG00000017844  DNAH3 
4    6: 27.1-27.3         0.03   -5.54    ENSCAFG00000023577  XM_547099.2 
11   16: 11.9-12.1        0.031  -5.52    ENSCAFG00000004011  TBXAS1 
11   16: 11.9-12.1        0.031  -5.52    ENSCAFG00000004038  HIPK2 
12   18: 3.8-4.0          0.032  -5.51 
2    1: 6.1-6.3           0.032  -5.49 
13   18: 6.4-6.6          0.034  -5.47 
5    7: 27.9-28.1         0.034  -5.46    ENSCAFG00000025140  RAB GTPase activating protein 1-like 
5    7: 27.9-28.1         0.034  -5.46    ENSCAFG00000025165  RAB GTPase activating protein 1-like 
13   18: 7.1-7.3          0.035  -5.45 
13   18: 6.2-6.4          0.037  -5.42 
5    7: 27.8-28.0         0.039  -5.38    ENSCAFG00000025165  RAB GTPase activating protein 1-like 
14   25: 4.0-4.2          0.039  -5.38    ENSCAFG00000005968 
6    10: 6.8-7.0          0.04   -5.36 
8    11: 50.4-50.6        0.042  -5.32 
1    1: 5.5-5.7           0.043  -5.29 
8    11: 50.3-50.5        0.048  -5.21 
8    11: 50.5-50.7        0.048  -5.21    ENSCAFG00000001734 
12   18: 3.7-3.9          0.049  -5.18 
9    11: 56.9-57.1        0.049  -5.18    ENSCAFG00000002386  FRMPD1 
9    11: 56.9-57.1        0.049  -5.18    ENSCAFG00000002389  RG9MTD3 
14   25: 4.2-4.4          0.05   -5.16 
6    10: 6.6-6.8          0.051  -5.15 
6    10: 6.7-6.9          0.053  -5.12 
6    10: 6.9-7.1          0.054  -5.1     ENSCAFG00000000319  HSPD1 
12   18: 3.6-3.8          0.054  -5.09 
3    4: 44.0-44.2         0.054  -5.09    ENSCAFG00000016919  TLX3 
3    4: 44.0-44.2         0.054  -5.09    ENSCAFG00000024671  XM_536433.2 
12   18: 4.1-4.3          0.055  -5.08    ENSCAFG00000003354  VWC2 
2    1: 5.9-6.1           0.055  -5.07    ENSCAFG00000000016  MBP 
2    1: 5.9-6.1           0.055  -5.07    ENSCAFG00000000017  ZNF236 
7    11: 40.8-41.0        0.056  -5.06 
2    1: 5.8-6.0           0.057  -5.04    ENSCAFG00000000016  MBP 
2    1: 5.8-6.0           0.057  -5.04    ENSCAFG00000000017  ZNF236 
5    7: 27.6-27.8         0.058  -5.02    ENSCAFG00000014399  XM_856082.1 
13   18: 6.7-6.9          0.058  -5.02 

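The Z-scores in Table S4 come from Z-transforming a per-window statistic, Z = (x - μ)/σ, and keeping windows beyond a cut-off (Z(Hp) < -5 in Figure S11). The sketch below illustrates that step on made-up data and then merges overlapping outlier windows into candidate regions. Several details are assumptions rather than the authors' procedure: the use of the population standard deviation, the 100 Kb step between windows (read off the overlapping positions in Table S4), the merge rule, and all function names.

```python
from statistics import mean, pstdev


def z_transform(values):
    """Z-transform a list of window statistics: Z = (x - mu) / sigma."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [(x - mu) / sigma for x in values]


def select_outliers(windows, z_scores, cutoff, lower_tail):
    """Keep windows with Z < cutoff (lower tail) or Z > cutoff (upper tail)."""
    if lower_tail:
        return [w for w, z in zip(windows, z_scores) if z < cutoff]
    return [w for w, z in zip(windows, z_scores) if z > cutoff]


def merge_windows(windows):
    """Merge overlapping or abutting (chrom, start, end) windows into regions."""
    regions = []
    for chrom, start, end in sorted(windows):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            prev = regions[-1]
            regions[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            regions.append((chrom, start, end))
    return regions


# Hypothetical toy data: 60 windows of 200 Kb in 100 Kb steps on one chromosome
# (coordinates in Kb), with two adjacent windows of strongly reduced Hp.
windows = [("18", i * 100, i * 100 + 200) for i in range(60)]
hp = [0.30] * 60
hp[38] = hp[39] = 0.02

z_hp = z_transform(hp)
outliers = select_outliers(windows, z_hp, cutoff=-5.0, lower_tail=True)
regions = merge_windows(outliers)
print(regions)
```

In this toy example the two low-Hp windows (3.8-4.0 Mb and 3.9-4.1 Mb) both pass the cut-off and merge into a single region spanning 3.8-4.1 Mb. The same functions apply to FST by calling `select_outliers` with `cutoff=5.0` and `lower_tail=False`.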

Table S5. Ranking of 200 Kb windows in the dog genome based on a significantly 
increased fixation index between dog and wolf, FST, sorted by Z-score. CDR 
indicates which candidate domestication region the window is part of. Z-score refers to 
the value of the window after Z-transformation of the FST distribution. Ensembl ID and 
gene name or gene description is shown for genes residing in these windows. 

CDR  Position       FST   Z-score  Ensembl ID          Gene 
30   25: 4.1-4.3    0.9   7.74 
23   16: 10.1-10.3  0.89  7.65     ENSCAFG00000003841  MGAM 
23   16: 10.1-10.3  0.89  7.65     ENSCAFG00000003856  Q2ABD2_CANFA 
23   16: 10.1-10.3  0.89  7.65     ENSCAFG00000003864  CLEC5A 
23   16: 10.1-10.3  0.89  7.65     ENSCAFG00000003872 
23   16: 10.1-10.3  0.89  7.65     ENSCAFG00000003876  XM_539875.1 
23   16: 10.1-10.3  0.89  7.65     ENSCAFG00000003879 
14   7: 27.6-27.8   0.86  7.25     ENSCAFG00000014399  XM_856082.1 
25   18: 3.9-4.1    0.85  7.17 
12   6: 50.0-50.2   0.85  7.16     ENSCAFG00000019972  RNPC3 
20   14: 10.3-10.5  0.84  7.06     ENSCAFG00000001524  AHCYL2 

Position       FST   Ensembl ID          Gene 
14: 10.3-10.5  0.84  ENSCAFG00000001528  XM_856148.1 
14: 10.3-10.5  0.84  ENSCAFG00000001531  SMO 
25: 4.0-4.2    0.84  ENSCAFG00000005968 
18: 3.8-4.0    0.84 
1: 6.0-6.2     0.83  ENSCAFG00000000017  ZNF236 
18: 6.9-7.1    0.82 
18: 7.0-7.2    0.82 
3: 18.2-18.4   0.81  ENSCAFG00000008064  FAM172A 
18: 6.5-6.7    0.8 
18: 7.1-7.3    0.8 
18: 3.7-3.9    0.8 
26: 27.9-28.1  0.79  ENSCAFG00000013439  SGLT1 
26: 27.9-28.1  0.79  ENSCAFG00000013498  SGLT3 
26: 27.9-28.1  0.79  ENSCAFG00000023285  LOC612066 
7: 27.7-27.9   0.78  ENSCAFG00000025165  RAB GTPase activating protein 1-like 
25: 4.2-4.4    0.77 
18: 3.6-3.8    0.76 
4: 44.0-44.2   0.76  ENSCAFG00000016919  TLX3 
4: 44.0-44.2   0.76  ENSCAFG00000024671  XM_536433.2 
1: 5.9-6.1     0.76  ENSCAFG00000000016  MBP 
1: 5.9-6.1     0.76  ENSCAFG00000000017  ZNF236 
16: 9.8-10.0   0.75  ENSCAFG00000003811 
16: 9.8-10.0   0.75  ENSCAFG00000003812  T cell receptor beta variable 9 
16: 9.8-10.0   0.75  ENSCAFG00000003814 
16: 9.8-10.0   0.75  ENSCAFG00000003815 
16: 9.8-10.0   0.75  ENSCAFG00000003817 
16: 9.8-10.0   0.75  ENSCAFG00000003818  TRY1_CANFA 
16: 9.8-10.0   0.75  ENSCAFG00000003820  XM_846299.1 
16: 9.8-10.0   0.75  ENSCAFG00000003823  PRSS58 
16: 9.8-10.0   0.75  ENSCAFG00000003827  XM_539871.2 
16: 9.8-10.0   0.75  ENSCAFG00000014478  T cell receptor beta constant 2 
16: 9.8-10.0   0.75  ENSCAFG00000024325 
16: 9.8-10.0   0.75  ENSCAFG00000024754 
16: 9.8-10.0   0.75  ENSCAFG00000024805 
16: 9.8-10.0   0.75  ENSCAFG00000024808 
16: 9.8-10.0   0.75  ENSCAFG00000024810  T cell receptor beta variable 2 
16: 9.8-10.0   0.75  ENSCAFG00000024819  T cell receptor beta variable 19 
6: 28.1-28.3   0.75  ENSCAFG00000018025  GPR139 
16: 9.9-10.1   0.75  ENSCAFG00000003814 
16: 9.9-10.1   0.75  ENSCAFG00000003815 
16: 9.9-10.1   0.75  ENSCAFG00000003817 
16: 9.9-10.1   0.75  ENSCAFG00000003818  TRY1_CANFA 
16: 9.9-10.1   0.75  ENSCAFG00000003820  XM_846299.1 
16: 9.9-10.1   0.75  ENSCAFG00000003823  PRSS58 
16: 9.9-10.1   0.75  ENSCAFG00000003827  XM_539871.2 
16: 9.9-10.1   0.75  ENSCAFG00000003841  MGAM 
16: 9.9-10.1   0.75  ENSCAFG00000024325 
16: 9.9-10.1   0.75  ENSCAFG00000024754 

6: 
1: 
1: 


18: 
37: 
37: 
Sie 
37: 
37: 
37: 


1: 
10 
15 
1: 


> 9.9-10.1 

> 9.9-10.1 

> 9.9-10.1 

> 4.1-4.3 

: 6.4-6.6 
> 41.7-41.9 
: 6.2-6.4 

: 6.7-6.9 
1 22.9-23.1 
: 6.6-6.8 
: 8.2-8.4 
: 40.7-40.9 
7: 27.8-28.0 


6: 
28: 


18 
6: 
3: 
12.3-12.5 
12.3-12.5 
12.3-12.5 
12.3-12.5 
> 4.0-4.2 
56.3-56.5 
18.3-18.5 


0.75 
0.75 
0.75 
0.75 
0.75 
0.74 
0.74 
0.74 
0.74 
0.74 
0.74 
0.73 
0.73 
0.73 
0.73 
0.73 
0.73 
0.73 
0.72 
0.72 
0.72 
0.72 
0.72 
0.72 
0.71 
0.71 
0.71 
0.71 
0.71 
0.71 
0.71 
0.71 
0.71 
0.71 
0.7 
0.7 
0.7 
0.7 
0.7 
0.7 
0.7 
0.7 
0.7 
0.7 
6.03 
6.03 
6.03 
6.02 
ENSCAFG00000024805 
ENSCAFG00000003354 
VWC2 


RNF103-CHMP3 
KDM3A 
REEPI 


COG6 


FAM40B 
AHCYL2 


ACMSD 
CCNT2 
YSK4 
RAB3GAPI1 


MBP 
ZNF 236 


SF3B1 
COQ10B 


ASPE1 
MOB4 
RFTN2 


RAB GTPase activating 
protein I-like 


COLIIAI 
ENTPD1 
FAM172A 
11 6: 28.0-28.2 

Table S6. Summary of 36 autosomal candidate domestication regions (CDRs). Start 
and end of regions with significantly reduced average pooled heterozygosity, Hp, and/or 
increased fixation index, FST, show the position of 36 CDRs. Individual CDRs are 
separated by thick horizontal lines. In two cases the FST analysis identifies a single 
coherent region, while the Hp analysis indicates two separate regions. For these regions 
a dashed horizontal line separates the individual Hp regions. Ensembl IDs, gene 
descriptions and gene names of genes residing in CDRs are shown. The * indicates 
genes that reside in regions 100 Kb up- or downstream of the CDR. 

CDR  chr  Hp region  start  end  FST region  start  end  Ensembl gene id  gene description  gene name 



1 2 5817430 6317430 5517430 

2 1 2 49617430 

3 1 3 66617430 

4 1 4 83017430 

5 2 5 46500196 

6 3 6 18207515 

7 3 7 21507515 

8 3 8 34907515 

9 4 9 17700233 
4400023 4420023 

10 4 3 3 3 10 44000233 
2710792 2730792 

ul 6 4 4 4 
protein 159 
GP2 Precursor 
ENSCAFG00000018025 receptor 139 GPRI39 
acyl-CoA synthetase 
medium-chain family 
= ENSCAFG00000023658 member 5 ACSMS5 
acyl-CoA synthetase 
medium-chain family 
* ENSCAFG00000024109 member 2A XM _536949.2 
RNA-binding region 
(RNP1, RRM) 
13 6 12 49907924 50507924 ENSCAFG00000019972 containing 3 RNPC3 
collagen, type XI, 
ENSCAFG00000019985 alpha 1 COL11A1 
14 6 13 56307924 56507924 
RAB GTPase 
activating protein 1- 
ENSCAFG00000025140 ike XP_861321.1 
RAB GTPase 
activating protein 1- 
ENSCAFG00000025165 ike 
6 8 15 30702583 30902583 
FERM domain 
* ENSCAFG00000014633 containing 6 ERMD6 
eucine-rich repeats 
and immunoglobulin- 
7 0 16 5701010 5901010 ENSCAFG00000000311 ike domains 3 LRIG3 
similar to heat shock 
protein | 
8 0 6 6601010 7101010 17 6601010 7001010 ENSCAFG00000000319 (chaperonin) HSPD1 
4080008 4110008 
9 1 7 6 6 
SH3-domain GRB2- 
bd ENSCAFG00000001567 like 2 
5030008 5070008 
20 1 8 6 6 19 50300086 50700086 ENSCAFG00000001734 
5690008 5710008 FERM and PDZ 
21 1 9 6 6 ENSCAFG00000002386 domain containing | FRMPDI 
RNA (guanine-9-) 
methyltransferase 
ENSCAFG00000002389 domain containing 3 RGIMTD3 
polymerase (RNA) I 
7 ENSCAFG00000002377 polypeptide E, 53kDa POLRIE 
be ENSCAFG00000002382 F-box protein 10 FBXO10 
family with sequence 
similarity 40, 
22 14 20 10200337 10500337 ENSCAFG00000001515 member B FAM40B 
adenosylhomocystein 
ENSCAFG00000001524 ase-like 2 AHCYL2 
ribosomal protein 
ENSCAFG00000001528 L12 XM_856148.1 
smoothened homolog 
ENSCAFG00000001531 (Drosophila) SMO 
nuclear respiratory 
ha ENSCAFG00000001507 factor 1 NRFI 
23 15 21 8103479 8403479 
glutamate receptor, 
* ENSCAFG00000003333 ionotropic, kainate 3 GRIK3 
vezatin, adherens 
junctions 
transmembrane 
24 15 22 38203479 38403479 ENSCAFG0000000633 1 protein VEZT 
methionyl 
ENSCAFG00000006353 aminopeptidase 2 METAP2 
FYVE, RhoGEF and 
PH domain 
* ENSCAFG00000006273 containing 6 FGD6 
maltase- 
1010739 1030739 glucoamylase (alpha- 
25 16 10 1 1 23 9807391 10307391 ENSCAFG00000003841 glucosidase) MGAM 
Q2ABD2_CANF 
ENSCAFG00000003856 Taste receptor type 2 A 
C-type lectin domain 
ENSCAFG00000003864 family 5, member A CLECSA 
ENSCAFG00000003872 
cOR9A7 olfactory 
receptor family 9 
ENSCAFG00000003876 subfamily A-like XM_539875.1 
ENSCAFG00000003879 
23 9807391 10307391 ENSCAFG00000003811 
T cell receptor beta 
ENSCAFG00000003812 variable 9 
ENSCAFG000000038 14 
ENSCAFG00000003815 
ENSCAFG00000003817 
Cationic trypsin 


Precursor (EC 
ENSCAFG00000003818 3.4.21.4) TRY1_CANFA 
ENSCAFG00000003820 similar to trypsin X5 XM_846299.1 
ENSCAFG00000003823 similar to trypsin X3 PRSS58 
similar to 
monooxygenase, 
ENSCAFG00000003827 DBH-like 2 XM_539871.2 
T cell receptor beta 
ENSCAFG00000014478 constant 2 
ENSCAFG00000024325 
ENSCAFG00000024754 XM_539869.2 
ENSCAFG00000024805 
ENSCAFG00000024808 
T cell receptor beta 
ENSCAFG00000024810 variable 2 
T cell receptor beta 
ENSCAFG00000024819 variable 19 
ENSCAFG00000025371 
similar to Maltase- 
glucoamylase, 
ENSCAFG00000025387 intestinal XM_539872.2 
T cell receptor beta 
* ENSCAFG00000024822 variable 28 TRBV250R9-2 
Anionic trypsin 
% ENSCAFG00000014481 Precursor TRY2_CANFA 
T cell receptor beta 
ki ENSCAFG00000014471 variable 29-1 
1190739 1210739 thromboxane A 
26 16 ll 1 1 ENSCAFG00000004011 synthase | (platelet) TBXAS1 


homeodomain 

interacting protein 
ENSCAFG00000004038 kinase 2 HIPK2 

poly (ADP-ribose) 

polymerase family, 


* ENSCAFG00000003997 member 12 PARP12 
vacuolar protein 
sorting 24 homolog RNF103- 
27 17 24 41722203 41922203 ENSCAFG00000007479 (S. cerevisiae) CHMP3 
lysine (K)-specific 
ENSCAFG00000007522 demethylase 3A. KDM3A 
receptor accessory 
ENSCAFG00000007549 protein 1 REEP1 
ring finger protein 
is ENSCAFG00000007472 103 RNF103 


von Willebrand factor 
C domain containing 


28 18 12 3604681 4304681 25 3404681 4404681 ENSCAFG00000003354 2 VWC2 
zona pellucida 
ENSCAFG00000003356 binding protein ZPBP. 
29 18 13 6204681 7304681 26 6204681 7704681 


aminocarboxymucon 
ate semialdehyde 
30 19 27 40704062 40904062 ENSCAFG00000005020 decarboxylase ACMSD 


ENSCAFG00000005031 cyclin T2 CCNT2 
YSK4 Sps1/Ste20- 
related kinase 


homolog (S. 
ENSCAFG00000005038 cerevisiae) YSK4 
RAB3 GTPase 
activating protein 
ENSCAFG00000005042 subunit | (catalytic) RAB3GAP1 
TBC1 domain family, 
ha ENSCAFG00000003662 member 9 
31 22 29 22941181 23141181 
32 25 14 4000488 4400488 30 4000488 4500488 ENSCAFG00000005968 


component of 
oligomeric golgi 
ENSCAFG00000005979 complex 6 COG6 
solute carrier family 
5 (sodium/glucose 
cotransporter), NP_001007142. 
33 26 31 27900108 28100108 ENSCAFG00000013439 member 1 1 
solute carrier family 
5 (low affinity 
glucose 
cotransporter), 
26 ENSCAFG00000013498 member 4 SLCSA4 
similar to Ig lambda 
chain V region 4A 
26 ENSCAFG00000023285 precursor LOC612066 
tyrosine 3- 
monooxygenase/trypt 
ophan 5- 
monooxygenase 
activation protein, eta 
he ENSCAFG00000013330 polypeptide YWHAH 
BTAF1 RNA 
polymerase II, B- 
34 28 32 9400594 9600594 ENSCAFG00000007355 TFIID transcription BTAF1 
35 28 


36 37 


33 12200594 12500594 


* 


* 


35 9915022 10115022 
ENSCAFG00000007423 


ENSCAFG00000007302 


ENSCAFG00000007314 


ENSCAFG00000008433 


ENSCAFG00000008439 


ENSCAFG00000008441 


ENSCAFG00000008444 


ENSCAFG00000008446 


ENSCAFG00000008454 


ENSCAFG00000025587 


ENSCAFG00000008339 


ENSCAFG00000008406 


ENSCAFG00000010826 


ENSCAFG00000010837 


ENSCAFG00000010865 


ENSCAFG00000010890 


ENSCAFG00000010899 


ENSCAFG00000010934 


factor-associated, 
170kDa 
cytoplasmic 
polyadenylation 
element binding 
protein 3 

tankyrase, TRF 1- 
interacting ankyrin- 
related ADP-ribose 
polymerase 2 
fibroblast growth 
factor binding protein 
3 


ectonucleoside 
triphosphate 
diphosphohydrolase 1 


cyclin J 


aldehyde 
dehydrogenase 18 
family, member Al 
tectonic family 
member 3 

splicing factor 3b, 
subunit 1, 155kDa 
coenzyme Q10 
homolog B 


heat shock 10kDa 
protein | (chaperonin 
10 

MOBI, Mps One 
Binder kinase 
activator-like 3 

raftlin family member 


CPEB3 


TNKS2 


FGFBP3 


ENTPDI 


CCNJ 


ALDHI8A1 
TCIN3 
SF3B1 


COQI0B 


HSPE1 


MOB4 


RFTN2 


Table S7. Significantly overrepresented Gene Ontology terms (GO-terms) among 
genes residing in CDRs. 

Ensembl id          Genes     Group count  Total count  P-value  GO term 

ENSCAFG00000013330  YWHAH 
ENSCAFG00000003354  VWC2 
ENSCAFG00000016919  TLX3      3            26           0.005    regulation of neuron differentiation 

ENSCAFG00000013439  SLC5A1 
ENSCAFG00000008433  ENTPD1 
ENSCAFG00000014481  PRSS3 
ENSCAFG00000010826  SF3B1 
ENSCAFG00000016903  FGF18 
ENSCAFG00000000015  GALR1 
ENSCAFG00000014481  PRSS1 
ENSCAFG00000000016  MBP 
ENSCAFG00000009720  CYFIP1 
ENSCAFG00000013330  YWHAH 
ENSCAFG00000003354  VWC2 
ENSCAFG00000016919  TLX3 
ENSCAFG00000019985  COL11A1 
ENSCAFG00000007472  RNF103 
ENSCAFG00000003872  OR9A4 
ENSCAFG00000001531  SMO 
ENSCAFG00000004011  TBXAS1 
ENSCAFG00000003856  TAS2R38 
ENSCAFG00000005968  FABP5 
ENSCAFG00000001567  SH3GL2 
ENSCAFG00000017807  CRYM      21           3822         0.005    multicellular organismal process 

ENSCAFG00000014481  PRSS1 
ENSCAFG00000013439  SLC5A1 
ENSCAFG00000014481  PRSS3 
ENSCAFG00000000015  GALR1     4            95           0.008    digestion 

ENSCAFG00000009720  CYFIP1 
ENSCAFG00000013330  YWHAH 
ENSCAFG00000003354  VWC2 
ENSCAFG00000016919  TLX3 
ENSCAFG00000001531  SMO       5            210          0.010    neuron differentiation 

ENSCAFG00000003662  TBC1D9 
ENSCAFG00000010890  HSPE1 
ENSCAFG00000001531  SMO 
ENSCAFG00000000015  GALR1 
ENSCAFG00000025165  RABGAP1L 
ENSCAFG00000005042  RAB3GAP1 
ENSCAFG00000006273  FGD6 
ENSCAFG00000005031  CCNT2     8            671          0.011    regulation of a molecular function 

ENSCAFG00000007472  RNF103 
ENSCAFG00000000016  MBP 
ENSCAFG00000016919  TLX3 
ENSCAFG00000001567  SH3GL2 
ENSCAFG00000001531  SMO       5            235          0.013    central nervous system development 

ENSCAFG00000009720  CYFIP1 
ENSCAFG00000013330  YWHAH 
ENSCAFG00000003354  VWC2 
ENSCAFG00000016919  TLX3 
ENSCAFG00000006273  FGD6      5            236          0.013    regulation of developmental process 

ENSCAFG00000009720  CYFIP1 
ENSCAFG00000013330  YWHAH 
ENSCAFG00000003354  VWC2 
ENSCAFG00000016919  TLX3 
ENSCAFG00000001531  SMO       5            242          0.013    generation of neurons 

ENSCAFG00000000016  MBP 
ENSCAFG00000009720  CYFIP1 
ENSCAFG00000007472  RNF103 
ENSCAFG00000003354  VWC2 
ENSCAFG00000013330  YWHAH 
ENSCAFG00000016919  TLX3 
ENSCAFG00000001567  SH3GL2 
ENSCAFG00000001531  SMO       8            716          0.013    nervous system development 

ENSCAFG00000003356  ZPBP 
ENSCAFG00000017814  ZP2       2            12           0.015    binding of sperm to zona pellucida 

ENSCAFG00000003356  ZPBP 
ENSCAFG00000017814  ZP2       2            12           0.015    sperm-egg recognition 

ENSCAFG00000009720  CYFIP1 
ENSCAFG00000013330  YWHAH 
ENSCAFG00000003354  VWC2 
ENSCAFG00000016919  TLX3 
ENSCAFG00000001531  SMO       5            262          0.015    neurogenesis 

ENSCAFG00000003356  ZPBP 
ENSCAFG00000017814  ZP2       2            14           0.019    cell-cell recognition 

ENSCAFG00000025165  RABGAP1L 
ENSCAFG00000003662  TBC1D9 
ENSCAFG00000010890  HSPE1 
ENSCAFG00000005042  RAB3GAP1 
ENSCAFG00000006273  FGD6 
ENSCAFG00000005031  CCNT2 
ENSCAFG00000000015  GALR1     7            605          0.020    regulation of catalytic activity 

ENSCAFG00000025165  RABGAP1L 
ENSCAFG00000003662  TBC1D9 
ENSCAFG00000010890  HSPE1 
ENSCAFG00000005042  RAB3GAP1 
ENSCAFG00000006273  FGD6      5            307          0.026    regulation of hydrolase activity 

ENSCAFG00000024109  ACSM2A 
ENSCAFG00000023658  ACSM5 
ENSCAFG00000024109  ACSM2B 
ENSCAFG00000004011  TBXAS1    4            191          0.031    fatty acid metabolic process 

ENSCAFG00000007472  RNF103 
ENSCAFG00000001531  SMO 
ENSCAFG00000016903  FGF18 
ENSCAFG00000005968  FABP5 
ENSCAFG00000000016  MBP 
ENSCAFG00000009720  CYFIP1 
ENSCAFG00000013330  YWHAH 
ENSCAFG00000003354  VWC2 
ENSCAFG00000016919  TLX3 
ENSCAFG00000001567  SH3GL2 
ENSCAFG00000019985  COL11A1   11           1605         0.034    system development 

ENSCAFG00000025165  RABGAP1L 
ENSCAFG00000003662  TBC1D9 
ENSCAFG00000005042  RAB3GAP1 
ENSCAFG00000006273  FGD6      4            211          0.039    regulation of GTPase activity 

ENSCAFG00000007472  RNF103 
ENSCAFG00000001531  SMO 
ENSCAFG00000016903  FGF18 
ENSCAFG00000005968  FABP5 
ENSCAFG00000000016  MBP 
ENSCAFG00000009720  CYFIP1 
ENSCAFG00000003354  VWC2 
ENSCAFG00000013330  YWHAH 
ENSCAFG00000006273  FGD6 
ENSCAFG00000016919  TLX3 
ENSCAFG00000001567  SH3GL2 
ENSCAFG00000019985  COL11A1   12           2005         0.039    anatomical structure development 

ENSCAFG00000016903  FGF18     1            1            0.039    intramembranous 

ENSCAFG00000005020  ACMSD     1            1            0.039    quinolinate metabolic process 

ENSCAFG00000003841  MGAM      1            1            0.039    starch metabolic process 

ENSCAFG00000003841  MGAM      1            1            0.039    starch catabolic process 

ENSCAFG00000013330  YWHAH     1            1            0.039    glucocorticoid catabolic process 

ENSCAFG00000010890  HSPE1 
ENSCAFG00000004038  HIPK2 
ENSCAFG00000001531  SMO 
ENSCAFG00000016903  FGF18 
ENSCAFG00000009720  CYFIP1 
ENSCAFG00000008406  TCTN3 
ENSCAFG00000013330  YWHAH 
ENSCAFG00000003354  VWC2 
ENSCAFG00000016919  TLX3      9            1242         0.039    cell development 


Table S8. Nervous system development genes classified by GO-analysis.

Position (Chrom:Mb)  Gene name  Function
1: 5.8-6.3           MBP        Insulates axons
3: 34.9-35.1         CYFIP1     Regulates synaptic plasticity and brain development
4: 44-44.2           TLX3       Determines excitatory over inhibitory cell fates in dorsal spinal cord
11: 40.8-41.1        SH3GL2     Regulates neurotransmitter release and short-term plasticity
14: 10.2-10.5        SMO        Mediates signals from hedgehog protein
17: 41.7-41.9        RNF103     Nervous system development
18: 3.6-4.3          VWC2       Promotes neurogenesis
26: 27.9-28.1        YWHAH      Nervous system development
Table S9. Additional CDR genes with a function in the central nervous system.

Position (Chrom:Mb)  Gene name  Function
1: 5.5-5.7           GALR1      Modulation of action potentials
1: 49.6-49.9         ARID1B     Neural development
1: 66.6-66.8         NKAIN2     Neural development
6: 27.1-27.3         CRYM       Binds thyroid hormone for possible regulatory or developmental roles
6: 28-28.3           GPR139     Specifically expressed in brain
8: 30.7-30.9         FRMD6      Regulating cell contact inhibition, organ size control etc.
15: 8.1-8.4          GRIK3      Excitatory neurotransmitter receptor
15: 38.2-38.4        VEZT       Regulates the dendritic formation of hippocampal neurons
16: 11.9-12.1        HIPK2      Regulates postnatal development of enteric dopaminergic neurons and glia
19: 40.7-40.9        ACMSD      Synaptic plasticity and neurodegeneration
28: 12.2-12.5        TCTN3      Homologous to a brain photoreceptor mediating photoperiodic response in birds



Table S10. Genes in CDRs with function related to starch digestion, glucose uptake
and storage.

Position (Chrom:Mb)  Gene     Function
6: 49.9-50.5         AMY2B    Starch digestion
16: 10.1-10.3        MGAM     Starch digestion
26: 27.9-28.1        SGLT1    Glucose uptake
6: 28-28.3           GP2      Pancreatic digestive enzyme storage
26: 27.9-28.1        SGLT3    Sugar sensor
16: 10.1-10.3        TAS2R38  Perception of bitterness
6: 28-28.3           ACSM5    Fatty acid metabolism
6: 28-28.3           ACSM2B   Fatty acid metabolism
15: 38.2-38.4        METAP2   Fat metabolism and appetite
25: 4-4.4            FABP5    Fatty acid metabolism
Table S11. Amylase copy numbers in 136 dogs and 35 wolves quantified using real
time PCR. Mean estimates from three replicates are reported.


DOGS Copy number WOLVES Copy number 
Swedish Lapphound 1 8.1 22799, Belarus 1.6 
Swedish Lapphound 2 15.1 22801, Russia 2.0 

American Staffordshire Terrier 1 11.5 22802, Russia 1.7

American Staffordshire Terrier 2 18.2 22803, Russia 1.7 

Beagle 1 8.6 22804, Bulgaria 1.8 

Bearded Collie 1 23.0 22807, Spain 1.7 

Bearded Collie 2 10.2 22808, Spain 1.8 

Bearded Collie 3 12.5 22809, Spain 1.6 

Beddlington Terrier 1 15.3 22810, Spain 1.8

Belgian Shepard 1 9.7 22800, Belarus 1.7 

Belgian Shepard 2 7.3 10w, Canada 2.4 

Belgian Tervern 1 12.0 11w, USA 2.1

Bichon Frise 1 16.2 Scandinavian Wolf 1 2.2 

Border Collie 1 6.9 Scandinavian Wolf 17 2.3 

Border Collie 2 7 Scandinavian Wolf 19 1.5 

Border Collie 10 9.4 Scandinavian Wolf 2 2.2

Border Collie 11 21.3 Scandinavian Wolf 20 2.2

Border Collie 12 17.0 Scandinavian Wolf 21 2.2

Border Collie 13 17.6 Scandinavian Wolf 22 1.6

Border Collie 14 17.6 Scandinavian Wolf 23 1.6

Border Collie 15 14.8 Scandinavian Wolf 24 2.2

Border Collie 16 13.3 Scandinavian Wolf 25 2.2

Border Collie 3 16.5 Scandinavian Wolf 26 2.4

Border Collie 4 15.1 Scandinavian Wolf 27 2.0

Border Collie 5 15.4 Scandinavian Wolf 28 2.2

Border Collie 6 13.2 Scandinavian Wolf 29 2.0

Border Collie 7 21.1 Scandinavian Wolf 3 2.0

Border Collie 9 14.7 Scandinavian Wolf 30 2.2

Border Terrier 1 11.7 Scandinavian Wolf 31 2.0 

Border Terrier 2 13.1 Scandinavian Wolf 4 2.0 

Boxer 1 14.8 Scandinavian Wolf 5 2.0

Boxer 1 10.3 Scandinavian Wolf 5 2.4

Boxer 2 15.9 Scandinavian Wolf 6 1.8

Boxer 2 10.8 Scandinavian Wolf 7 2.2

Boxer 3 15.2 Scandinavian Wolf 18 2.1 

Bullterrier 1 12.4 Average 2.0 

Bullterrier 2 10.8 

Chinese Crested 1 11.1 

Chinese Crested 2 9.6 

Cocker Spaniel 1 12.4

Cocker Spaniel 3 10.2 

Collie 1 22.8 

Dalmation 1 9.8 

English Springer Spaniel 1 11.0 

English Springer Spaniel 2 19.9 

English Springer Spaniel 3 23.0 

English Springer Spaniel 4 14.2 

Eurasier 1 15.3 

Field Spaniel 1 15.7 

German Shepard 1 14.8 
German Shepard 2 20.0
German Shepard 4 27.3 
German Shepard 5 16.9 
Golden Retriever 1 17.3 
Golden Retriever 2 26.1 
Golden retriever 3 21.2 
Great Dane 1 17.0 
Great Dane 2 12.3 
Great Dane 3 12.2 
Howawart 1 11.1
Howawart 2 15.6 
Howawart 3 15.3 
Irish Wolfhound 1 13.3 
Irish Wolfhound 2 12.0 
Irish Wolfhound 3 11.6 
Karelian Beardog 1 30.3 
Karelian Beardog 2 16.7 
Labrador 1 16.2 
Labrador 2 12.4 
Labrador 3 13.9 
Labrador 4 15.2 
Labrador 5 12.7 
Labrador 6 16.7 
Labrador 7 16.7 
Leonberger 1 19.0
Leonberger 2 21.9
Leonberger 3 25.0
Lowchen 1 17.6
Miniature Schnauzer 1 17.4
Miniature Schnauzer 2 24.4
Newfoundland 1 13.3
Newfoundland 2 18.6
Nova Scotia Duck Trolling Retriever 1 12.5
Nova Scotia Duck trolling retriever 2 11.7 
Nova Scotia Duck trolling retriever 3 9.4 

Nova Scotia Duck trolling retriever 4 11.7 
Papillion 1 17.4 
Papillion 2 15.3 
Polish Lowland Sheepdog 1 15.1 
Polish Lowland Sheepdog 2 11.2 
Polish Lowland Sheepdog 3 15.8 
Poodle 1 8.0 

Poodle 2 15.6 
Poodle 3 8.7 

Poodle 4 22.9 
Poodle 5 9.5 

Portuguese Waterdog 1 12.9
Pug 2 5.1

Pug 1 8.0

Pumi 1 9.4

Pumi 2 24.0 
Giant Schnauzer 1 16.2 
Giant Schnauzer 2 8.7 

Rottweiler 1 11.4
Rottweiler 2 23.9
Rottweiler 3 20.3
Samoyed 1 8.6

Samoyed 2 4.5 

Shar-Pei 1 7.5

Shar Pei 2 9.5 

Sheltie 1 12.6 
Smaland Hound 1 18.5 
Smaland Hound 2 12.2 
Smaland Hound 3 17.9 
Swedish Elkhound 1 16.9 
Swedish Elkhound 10 14.1 
Swedish Elkhound 17 17.1 
Swedish Elkhound 18 13.8 
Swedish Elkhound 3 17.5 
Swedish Elkhound 11 18.7
Swedish Elkhound12 16.4 
Swedish Elkhound13 15.0 
Swedish Elkhound14 15.2 
Swedish Elkhound15 14.9 
Swedish Elkhound16 12.5 
Swedish Elkhound4 15.7 
Swedish Elkhound5 14.0 
Swedish Elkhound7 9.3 

Swedish Elkhound8 18.4 
Swedish Elkhound9 14.5 
Swedish Vallhund 1 19.5 
Swedish Vallhund 2 12.2 
Tibetan Spaniel 1 7.0

West Highland White Terrier 1 12.2
West Highland White Terrier 2 8.2 

West Siberian Laika 25.4 
Average 14.7 
Table S12. Amylase activity in fresh serum from 6 dogs and 6 wolves.

                          Amylase activity (µkat/L)
Wolves
Järvsö zoo 1              3.6
Järvsö zoo 2              3.3
Järvsö zoo 3              2.8
Järvsö zoo 4              4.1
Järvsö zoo 5              3.8
Järvsö zoo 6              3.2
Average                   3.5
Dogs
Bernese Mountain dog      17
Irish Setter              10.6
Mixed breed               15.1
Bichon Frisé              15.1
Boxer                     16
English Springer Spaniel  9.3
Average                   13.85
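
As a quick arithmetic check on Table S12, the group averages and the dog-to-wolf ratio can be recomputed from the listed values. The sketch below is illustrative only: the variable names are ours, the values are transcribed from the table, and the ratio is simply the quotient of the two group means, not a statistic reported in the table.

```python
from statistics import mean

# Serum amylase activity (µkat/L), transcribed from Table S12.
wolves = [3.6, 3.3, 2.8, 4.1, 3.8, 3.2]
dogs = [17.0, 10.6, 15.1, 15.1, 16.0, 9.3]

wolf_mean = mean(wolves)
dog_mean = mean(dogs)

# Ratio of group means (an illustrative summary, not a value from the table).
fold = dog_mean / wolf_mean

print(f"wolf mean: {wolf_mean:.2f} µkat/L")
print(f"dog mean:  {dog_mean:.2f} µkat/L")
print(f"dog/wolf ratio of means: {fold:.1f}")
```

The recomputed means agree with the Average rows of the table after rounding.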
Table S13. Amylase activity from serum of 6 wolves. Note that these samples were
assayed using a different instrument (VetScan® (Abaxis Inc., USA)) compared to the
other measurements. Range of dog reference values is noted for comparison.

Wolves            Amylase activity (IU/L)   Dog reference
Nordens ark zoo   176
Nordens ark zoo   197
Nordens ark zoo   193
Nordens ark zoo   167
Nordens ark zoo   54
Nordens ark zoo   198
Nordens ark zoo   171
Nordens ark zoo   223
Average           172.4
Table S14. Canine reference panel. Panel of 74 dogs representing 38 diverse breeds
and 21 wolves of wide geographical origin used for the iPLEX genotyping assay.

Dog breed
Polish Lowland
American Staffordshire terrier
Karelsk Björnhund
Pumi
Golden Retriever
Bearded Collie
Border Collie
Smålandsstövare
English Springer Spaniel
German Shepard
Samoyed
Pug
Miniature Schnauzers
Swedish Lapphound
Poodle
Elkhound
Hovawart
Shar-Pei
Toller
West Highland White Terrier
Papillion
Bichon Frisé
Västgötaspets
Cocker Spaniel
Irish Wolfhound
Boxer
Belgian Tervuren
Rottweiler
Great Dane
Bull Terrier
Chinese Crested
Boston Terrier
Löwchen
Labrador
Leonberger
Sheltie
Giant Schnauzer
Dalmation
Total 74

Wolf origin
Sweden
Spain
Russia
Belarus
Bulgaria
USA
Canada
Copenhagen Zoo
Nordens Ark Zoo
Total 21


Table S15. Candidate causative mutations in MGAM and SGLT1. Alleles, allele
frequencies (freq.) and number of chromosomes (sequencing reads) sampled (cov.) for
five candidate causative mutations residing in MGAM and SGLT1 are shown for wolf
and dog. Results are from the resequencing data and genotyping assay (when available).

                   RESEQUENCING DATA                     GENOTYPING DATA
                   WOLF               DOG                WOLF               DOG
chr.  position     allele freq. cov.  allele freq. cov.  allele freq. cov.  allele freq. cov.
16    10103702     CA     1     6     -      1     50    -      -     -     -      -     -
16    10117660     A      0.57  7     G      1     53    -      -     -     -      -     -
16    10135196     T      1     11    C      1     54    T      1     40    C      0.92  142
16    10143343     T      0.67  9     T      1     27    T      0.68  40    T      1     142
26    27964111     A      0.83  6     G      0.98  41    A      0.98  40    G      0.94  142


Table S16. MGAM non-synonymous candidate causative mutation (chr16:
10,135,196). Wolf, the omnivorous rat and the insectivorous hedgehog and short-tailed
opossum lack valine at MGAM residue 1001.

Dog         ASSSPGVPFCYFVNDLYSVSDVQYDSHGATATISLKSSVYASALPSVPVTSL
Wolf        ASSSPGVPFCYFVNDLYSVSDEQYDSHGATATISLKSSVYASALPSVPVTSL
Human       ASNSSGVPFCYFVNDLYSVSDVQYNSHGATADISLKSSVYANAFPSTPVNPL
Chimpanzee  ASNSSGVPFCYFVNDLYSVSDVQYNSHGATADISLKSSIYANAFPSTPVNPL
Orangutan   ASNSSGVPFCYFVNDLYSVSDVQYNSHGATADISLKSSVYASAFPSTPVNPL
Macaque     ASNSSGVPFCYFVNDLYSVSNVQYSSHGATADISLKSSVYANAFPSTPVNPL
Baboon      ASNSSGVPFCYFVNDLYSVSNV --------
Mouse       ESNTIGVPTCYFAHELYSVSNVQYDSHGATADISLKASTYSNAFPSTPVNKL
Rat         VSNTPGVPHCYFANELYSVSNBQYNSHGATADIFLKASTYSNAFPSTPVNQL
Guinea pig  ESSTTGVPFCYFVTDLYSVSNVQYDSQGASADISLKSSSYANAFPSTPVSPL
Rabbit      ESASPGVPFCYFVNDLYSVSNVQYNSDGATADISLKSSVEANAFPSTPVNPL
Horse       ESSSPGVPFCYFVSDLYSVSDVQYDTHGATAVISLNSSPYAYALPSIPVNSL
Hedgehog    VSTIDRVPHCYFVKDLYSVSDBQYNSNGASAVISLSSSLYANAFPSTPVNPL
Elephant    ESSISGVPFCYFVSDLYSVD   YKADGATADISLKTGVYADAFPSTPVTSL
Opossum     LSNSPGVPNCYVINHLYSVSS  YNPTGITADIFLNSPVRASAGLSTPVNPL
Table S17. MGAM non-synonymous candidate causative mutation (chr16:
10,143,343). Wolf and the insectivorous shrew and hedgehog lack methionine at
MGAM residue 797.

Dog         GARARWRKQRVEMGLPADKIGLHLRGGHIFPTQQPATTTVAS
Wolf        GARARWRKQRVEMGLPADKIGLHLRGGHIFPTQQPATTTVAS
Human       GSQVRWRKQKVEMELPGDKIGLHLRGGYIFPTQQPNTTTLAS
Chimpanzee  GSQVRWRKQKVEMELPGDKIGLHLRGGYIFPTQQPNTTTLAS
Orangutan   GSQVRWRKQKVEMELPGDKIGLHLRGGYIFPTQQPNTTTLAS
Macaque     GNQVRWRKQKVEMELPGDKIGLHLRGGYIFPTQQPNTTTLAS
Baboon      GNQVRWRKQKVEMELPGDKIGLHLRGGYIFPTQQPNTTTLAS
Mouse       GEELGWRKQSIEMQLPGDKIGLHLRGGYIFPTQQPATTTEAS
Rat         GEQLAWRKQSVEMELPEDKIGLHLRGGYIFPTQQPATTTEAS
Guinea pig  GGQLGWRKQNIEMELPGDKIGLHLRGGYIFPIQQPSTTTVAS
Cow         -----WRKQFVEMLLPGDRIGLHLRGGYIFPIQQPNTTTETS
Horse       GGRVRWRKQQVEMDLPGDKIGLHLRGGYIFPTQQPATTTVAS
Cat         GARTRWRKQRVEMELPGDKIGLHLRGGHVFPTQQPATTTVVS
Bat         GSQLRWRKQKVEMQLPGDKIGLHLRGGYIFPTQQPATTTVA-
Hedgehog    GAKMNWRGNKVEBQLPKDKIGLHFRGGYIFPIQEPAMTTVAS
Shrew       GAQLNWRGNKD-BMLPKDKIGLHLRGGYIFPTQQPATTTVAS
Elephant    GARIRWRKQQVEMELPGDKIGLHLRGGYIFPTQEPSTTTEAS
Sloth       GGQIRWRKQKVEMLLPGDKIGLHLRGGYIFPTQQPATTTVL-
Opossum     GGQIPERKQQVEMLFSPEQIGLHLRGGYIFPIQQPAITTVAS


Table S18. MGAM candidate causative C-terminal deletion disrupts a stop codon
and thereby extends dog MGAM by two amino acids (chr16: 10,103,702) relative
to wolf. Blue capital letters refer to coding sequence. Black, small letters denote 3’
UTR. Omnivorous mouse lemur and rat, as well as herbivorous rabbit, pika, alpaca and
cow share an extension of MGAM similar to that seen in dog.


Dog           GAGA-AGCACTC--TGAATTTTTAGagc
Wolf          GAGA-AGCACTCCATGAatttttagagc
Human         GATA-AGCACTCTGTGAatttttacagc
Chimp         GATA-ATCACTCTGTGAatttttacagc
Gorilla       GATA-AGCACTCTGTGAatttttacagc
Orangutan     GATA-AGCACTCTGTGAatttttacagc
Rhesus        GAGA-AGCACTCCGTGAatttttagagc
Baboon        GAGA-AGCACTCCGTGAatttttagagc
Marmoset      GAGA-AGCACTCCTTGAattttcagagc
Tarsier       GAAA-AGCACTCTGTGAatttttaaaac
Mouse lemur   GAGA-AGCACTC--TGAATTTTTAGagc
Bushbaby      GA-A-TACTCTG
Tree shrew    GGGA-AGCAGCC
Rat           GAGCTAGCTCTT  TAATGTTTTAAagc
Kangaroo rat  GAAT-ATCTCTC  TAAaactttagtgt
Mouse         G----AGCTCTT  TAAt-ttttaaagc
Guinea pig    GAAA-AGCTCTC  TAAatttttaggac
Squirrel      GAAA-AGCTCTC  TAAatttttagagc
Rabbit        GAGA-AGCACA-  CGAGCTCTCAGCGC
Pika          GAGA-AGCACT-- CGGATTCTTAGtac
Alpaca        GAGA-AGCCCTC  TGAGTTCCTAGagc
Dolphin       G-AG-AGCACTC  TGAatttttagagc
Cow           A-GA-AGCACGC  AA--TTTTAGagc
Microbat      GAAA-AGCACTC  TGAattttaagagc
Horse         GAGC-AGCACTC  TGAatttttagagc
Megabat       GAGA-AGCACTC  TGAattttaagaac
Shrew         GTAA-AACACTC  CAGacttttaaagc
Hedgehog      GACA-AACAACC  TGAactgctagagc
Elephant      GGCA-AGCACTC  TGAatactcagagc
Rock hyrax    GACA-AGCACTC  TGAattccaagagc
Tenrec        GACA-GGCACTCTGTGAacgtaaagagc
Sloth         GAGC-AGCACTCTGTGAatcttcagagt
Wallaby       GCAG-AGTTCCTCATGAccatttggagc
Opossum       GTGA-AGTTTCCAATGGccctttgaagc

doi:10.1038/nature11837


Table S19. MGAM candidate causative mutation affecting NR4A2 transcription
factor binding site (chr16: 10,117,660). Wolf shares a T with 4 other mammals at this
site: the mainly insectivorous tarsier and marmoset, the fish-eating dolphin and the
primarily seed-eating kangaroo rat. Dog and 21 other mammals share a C, as in the
canonical binding motif of the NR4A2 protein.


Canonical NR4A2 binding motif 


Human TGA 
Common marmoset TGT 
Macaque --- 
Orangutan TGA 
Gorilla TGA 
Chimpanzee TGA
Grey mouse lemur T-- 
Philippine tarsier TTA 
Rat TGA 
Kangaroo Rat TGA 
Guinea pig TGA 
Rabbit TGA 
Pika TGA 
Horse TGA 
Micro bat TGA 
Large flying fox CGA 
Cow CGA 
Dolphin TTA 
Lama -GA 
Common shrew TGA 
Thirteen lined ground squirrel TTA 
Rock hyrax TGC 
Hoffmann’s two-toed sloth TGA 
Platypus -GG 
Mouse TGA 
Elephant TGC 
Hedgehog TCA 


Table S20. TaqMan real time PCR primers and probe for the AMY2B CNV
quantification.

Primers
AMYLASE-5'  CCAAACCTGGACGGACATCT
AMYLASE-3'  TATCGTTCGCATTCAAGAGCAA

Probe
Amylase: 6FAM - TTT GAG TGG CGC TGG G - MGBNFQ
Table S21. Real time PCR primers for mRNA expression analyses.

Primers:
MGAM-5'   GGTTGCTTTGGATGATGAGG
MGAM-3'   AATGGAAACACTGCCCACTC
Amy1-5'   CTGGTGGGATAATGGTAGCAA
Amy1-3'   GAAAAATGAGCATTCCCATCC
SGLT1-5'  TGCCAGTAACATTGGGAGTG
SGLT1-3'  GGTAGATCTGGATTCGCTTGC


Table S22. Comparison of patterns of genetic variation in CDRs with the genomic
average using genotyping data from the Illumina 170K Canine HD-array. The SNP
density (SNPs/Kb), the minor allele frequency in dogs (MAF dog), the minor allele
frequency in wolf (MAF wolf) and the fixation index between dog and wolf (FST) are
stated for 36 CDRs and for the whole genome (genome).

CDR id       Chr.    Start     End       Length      SNPs    SNPs/Kb  MAF dog  MAF wolf  FST
genome       genome  genome    genome    1897311618  165542  0.09     0.24     0.29      0.13
1            1       5517430   6317430   800001      46      0.06     0.08     0.04      0.21
2            1       49617430  49917430  300001      19      0.06     0.16     0.18      0.23
3            1       66617430  66817430  200001      11      0.05     0.08     0.28      0.30
4            1       83017430  83217430  200001      10      0.05     0.07     0.16      0.24
5            2       46500196  46700196  200001      8       0.04     0.01     0.02      0.04
6            3       18207515  18507515  300001      26      0.09     0.09     0.57      0.49
7            3       21507515  21707515  200001      9       0.04     0.04     0.10      0.12
8            3       34907515  35107515  200001      7       0.03     0.07     0.02      0.09
9            4       17700233  17900233  200001      12      0.06     0.11     0.17      0.12
10           4       44000233  44200233  200001      12      0.06     0.06     0.14      0.35
11           6       27107924  27307924  200001      11      0.05     0.04     0.14      0.32
12           6       28007924  28307924  300001      10      0.03     0.03     0.17      0.58
13           6       49907924  50507924  600001      36      0.06     0.09     0.44      0.50
14           6       56307924  56507924  200001      9       0.04     0.02     0.31      0.77
15           7       27600316  28100316  500001      14      0.03     0.01     0.02      0.08
16           8       30702583  30902583  200001      8       0.04     0.12     0.29      0.29
17           10      5701010   5901010   200001      8       0.04     0.03     0.05      0.13
18           10      6601010   7101010   500001      23      0.05     0.11     0.12      0.26
19           11      40800086  41100086  300001      16      0.05     0.01     0.00      0.03
20           11      50300086  50700086  400001      15      0.04     0.03     0.11      0.25
21           11      56900086  57100086  200001      5       0.02     0.02     0.21      0.32
22           14      10200337  10500337  300001      22      0.07     0.10     0.46      0.57
23           15      8103479   8403479   300001      13      0.04     0.06     0.12      0.34
24           15      38203479  38403479  200001      8       0.04     0.06     0.00      0.14
25           16      9807391   10307391  500001      19      0.04     0.07     0.18      0.37
26           16      11907391  12107391  200001      8       0.04     0.06     0.03      0.05
27           17      41722203  41922203  200001      8       0.04     0.13     0.20      0.12
28           18      3404681   4404681   1000001     47      0.05     0.09     0.15      0.38
29           18      6204681   7704681   1500001     75      0.05     0.05     0.14      0.45
30           19      40704062  40904062  200001      12      0.06     0.23     0.25      0.18
31           22      22941181  23141181  200001      14      0.07     0.12     0.42      0.46
32           25      4000488   4500488   500001      26      0.05     0.07     0.22      0.57
33           26      27900108  28100108  200001      9       0.04     0.25     0.40      0.35
34           28      9400594   9600594   200001      0       0.00     NA       NA        0.00
35           28      12200594  12500594  300001      0       0.00     NA       NA        0.00
36           37      9915022   10115022  200001      0       0.00     NA       NA        0.00
CDR average                                                  0.04     0.08     0.19      0.27
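
The SNPs/Kb column of Table S22 appears to be the SNP count divided by the region length in kilobases. A minimal sketch under that assumption follows; the helper name `snp_density` is hypothetical, and the counts and lengths are transcribed from the table.

```python
def snp_density(n_snps: int, length_bp: int) -> float:
    """Return SNPs per kilobase for a region of length_bp base pairs."""
    return n_snps / (length_bp / 1000)

# (SNP count, length in bp) for the whole genome and two CDRs from Table S22.
regions = {
    "genome": (165542, 1897311618),
    "CDR 13": (36, 600001),
    "CDR 29": (75, 1500001),
}

densities = {name: snp_density(n, length) for name, (n, length) in regions.items()}

for name, density in densities.items():
    print(f"{name}: {density:.2f} SNPs/Kb")
```

Rounded to two decimals, these reproduce the tabulated densities for the three rows, which supports reading the column as a simple count-per-kilobase ratio.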


Table S23. Candidate domestication regions on chromosome X. Start and end of
regions with significantly increased fixation index, FST (Z(FST)>3), show the positions
of 6 CDRs. Individual CDRs are separated by horizontal lines. Ensembl IDs, gene
descriptions and gene names of genes residing in CDRs are shown.


Chr region start end FST Z(FST) Ensembl ID Gene description Gene

X 1 39105187 39505187 0.86 3.1 ENSCAFG00000014691 HNRNAI
X 2 42905187 43105187 0.96 3.6 ENSCAFG00000015996 G2/mitotic-specific cyclin-B3 CCNB3
X 2 ENSCAFG00000016002 RBM3
X 2 ENSCAFG00000016018 DGKK
X 3 81505187 81705187 0.91 Bis
X 4 111005187 111205187 0.85 3.0
X 5 111505187 112005187 0.88 3.2 ENSCAFG00000018988 MAGEC3
X 5 ENSCAFG00000018995 fibroblast growth factor 13 FGF13
X 6 112205187 112605187 0.85 3.0 ENSCAFG00000018998 Coagulation factor IX Precursor FA9_CANFA
Table S24. Ranking of 200 Kb windows in the wolf genome based on a significantly
reduced average pooled heterozygosity, Hp, sorted by Z-score. Wolf CSR indicates
which candidate selection region the window is part of. Z-score refers to the value of
the window after Z-transformation of the Hp distribution. Ensembl ID and gene name
or gene description are shown for genes residing in these windows.


wolf Position 
CSR (Chrom:Mb) Hp Z-score Ensemble ID Gene 

1 6: 42.1-42.3 0.013 -6.34 ENSCAFG00000014270 CRAMP1L
1 6: 42.1-42.3 0.013 -6.34 ENSCAFG00000019486 C16orf73
1 6: 42.1-42.3 0.013 -6.34 ENSCAFG00000019488 FAHD1
1 6: 42.1-42.3 0.013 -6.34 ENSCAFG00000019490 HAGH 
1 6: 42.1-42.3 0.013 -6.34 ENSCAFG00000019493 IGFALS 
1 6: 42.1-42.3 0.013 -6.34 ENSCAFG00000019494 NUBP2 
1 6: 42.1-42.3 0.013 -6.34 ENSCAFG00000019495 SPSB3 
1 6: 42.1-42.3 0.013 -6.34 ENSCAFG00000019496 MRPS34 
1 6: 42.1-42.3 0.013 -6.34 ENSCAFG00000019497 NME3 
1 6: 42.1-42.3 0.013 -6.34 ENSCAFG00000019513 MAPK8IP3
1 6: 42.1-42.3 0.013 -6.34 ENSCAFG00000019540 ANIL 
1 6: 42.1-42.3 0.013 -6.34 ENSCAFG00000019542 XM_547195.2 
1 6: 42.1-42.3 0.013 -6.34 ENSCAFG00000023861 EME2 

15 24: 5.9-6.1 0.022 -6.13 ENSCAFG00000005251 RALGAPA2 

15 24: 5.9-6.1 0.022 -6.13  ENSCAFG00000005264 INSM1 

15 24: 5.9-6.1 0.022 -6.13  ENSCAFG00000005270 C20orf26 
1 6: 42-42.2 0.025 -6.06 ENSCAFG00000019472 XM_859963.1 
1 6: 42-42.2 0.025 -6.06 ENSCAFG00000019476 NDUFB10 
1 6: 42-42.2 0.025 -6.06 ENSCAFG00000019480 RPL3L 
1 6: 42-42.2 0.025 -6.06 ENSCAFG00000019484 
1 6: 42-42.2 0.025 -6.06 ENSCAFG00000019486 C16orf73
1 6: 42-42.2 0.025 -6.06 ENSCAFG00000019488 FAHD1
1 6: 42-42.2 0.025 -6.06 ENSCAFG00000019490 HAGH 
1 6: 42-42.2 0.025 -6.06 ENSCAFG00000019493 IGFALS 
1 6: 42-42.2 0.025 -6.06 ENSCAFG00000019494 NUBP2 
1 6: 42-42.2 0.025 -6.06 ENSCAFG00000019495 SPSB3 
1 6: 42-42.2 0.025 -6.06 ENSCAFG00000019496 MRPS34 
1 6: 42-42.2 0.025 -6.06 ENSCAFG00000019497 NME3 
1 6: 42-42.2 0.025 -6.06 ENSCAFG00000019513 MAPK8IP3
1 6: 42-42.2 0.025 -6.06 ENSCAFG00000023861 EME2 
1 6: 42-42.2 0.025 -6.06 ENSCAFG00000024261 HS3ST6 
1 6: 41.8-42 0.032 -5.9 ENSCAFG00000019392 RAB26 
1 6: 41.8-42 0.032 -5.9 ENSCAFG00000019397 RNPS1 
1 6: 41.8-42 0.032 -5.9 ENSCAFG00000019412 CASKIN1
1 6: 41.8-42 0.032 -5.9 ENSCAFG00000019418 TRAF7 
1 6: 41.8-42 0.032 -5.9 ENSCAFG00000019424 O8MJF3_CANFA 
1 6: 41.8-42 0.032 -5.9 ENSCAFG00000019430 TBL3 
1 6: 41.8-42 0.032 -5.9 ENSCAFG00000019438 O9XSYS8 CANFA 
1 6: 41.8-42 0.032 -5.9 ENSCAFG00000019439 NTHL1 
1 6: 41.8-42 0.032 -5.9 ENSCAFG00000019449 SLC9A3R2 
1 6: 41.8-42 0.032 -5.9 ENSCAFG00000019456 ZNF598 
1 6: 41.8-42 0.032 -5.9 ENSCAFG00000019459 SYNGR3 
ENSCAFG00000005242
ENSCAFG00000005256
ENSCAFG00000004018
ENSCAFG00000025022
ENSCAFG00000001426
ENSCAFG00000001450
ENSCAFG00000001459
ENSCAFG00000000484
ENSCAFG00000000488


ENSCAFG00000015363
ENSCAFG00000015429
ENSCAFG00000019675
ENSCAFG00000019677
ENSCAFG00000019680
ENSCAFG00000019682
ENSCAFG00000019684
ENSCAFG00000019687
ENSCAFG00000023874


ENSCAFG00000005251


RBM26 
NDFIP2 


COPG2 
MEST 
TSGA14 


KCNS2 


ARHGDIG 
PDIA2 
DECR2 


TMEM8A
MRPL28
AXIN1
ITFG3 
RGS11 


RALGAPA2 
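
The Z-transformation referred to in the caption of Table S24 can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `z_transform` is hypothetical, the Hp values are invented for the example (with one window set near the lowest value in the table), and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev

def z_transform(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD over all windows
    return [(v - mu) / sigma for v in values]

# Hypothetical per-window Hp values; index 3 mimics a strongly depleted window.
hp = [0.30, 0.28, 0.33, 0.013, 0.31, 0.29, 0.32]

z = z_transform(hp)

# Windows are ranked by ascending Z-score, so the most negative comes first.
ranking = sorted(range(len(hp)), key=lambda i: z[i])

for i in ranking:
    print(f"window {i}: Hp={hp[i]:.3f}  Z={z[i]:.2f}")
```

Windows with unusually low pooled heterozygosity receive strongly negative Z-scores and therefore appear at the top of a ranking such as the one in Table S24.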
Table S25. Enriched GO-categories among genes residing in wolf candidate
selected regions (wolf CSRs).

Ensembl id Genes Group count Total count P-value GO term
ENSCAFG00000019430 TBL3
ENSCAFG00000003662 TBC1D9
ENSCAFG00000005540 CANT1
ENSCAFG00000000484 STK3
ENSCAFG00000019495 SPSB3
ENSCAFG00000019513 MAPK8IP3
ENSCAFG00000015429 PDIA2
ENSCAFG00000015363 ARHGDIG
ENSCAFG00000005256 NDFIP2
ENSCAFG00000019418 TRAF7
ENSCAFG00000019438 TSC2
ENSCAFG00000019392 RAB26
ENSCAFG00000005528 TIMP2
ENSCAFG00000023874 RGS11 14 1965 0.0184 intracellular signaling cascade

ENSCAFG00000019418 TRAF7
ENSCAFG00000019392 RAB26 2 8 0.0184 exocrine system development

ENSCAFG00000005256 NDFIP2
ENSCAFG00000019418 TRAF7
ENSCAFG00000005528 TIMP2
ENSCAFG00000005540 CANT1
ENSCAFG00000000484 STK3
ENSCAFG00000019513 MAPK8IP3 6 376 0.0188 protein kinase cascade

ENSCAFG00000005256 NDFIP2
ENSCAFG00000003662 TBC1D9
ENSCAFG00000019418 TRAF7
ENSCAFG00000019438 TSC2
ENSCAFG00000005528 TIMP2
ENSCAFG00000005540 CANT1
ENSCAFG00000023874 RGS11
ENSCAFG00000019513 MAPK8IP3 8 800 0.033 regulation of signal transduction

ENSCAFG00000019418 TRAF7
ENSCAFG00000005528 TIMP2 2 15 0.0341 regulation of MAPKKK cascade

ENSCAFG00000019418 TRAF7
ENSCAFG00000019392 RAB26 2 15 0.0341 regulation of exocytosis